Monday, November 27, 2017

Unable To Start DSE Post 5.1 Upgrade

After upgrading our DSE servers running Spark from version 5.0.8 to 5.1.0, DSE refused to start with the following error:

ERROR [main] 2017-11-20 14:33:58,931 - Unable to start DSE server.
com.datastax.bdp.plugin.PluginManager$PluginActivationException: Unable to activate plugin com.datastax.bdp.plugin.DseFsPlugin

Caused by: org.apache.cassandra.exceptions.ConfigurationException: DSEFS does not support authentication method configured with org.apache.cassandra.auth.PasswordAuthenticator. DSEFS supports INTERNAL, LDAP and DIGEST authentication schemes configured with DseAuthenticator.
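When a node fails like this, the relevant lines can be pulled out of the log with a small shell helper. This is only a minimal sketch: the function name find_startup_errors is made up for illustration, and the log path in the usage line is an assumption, so adjust it to your environment.

```shell
# Hypothetical helper: print DSE startup failures found in a log file.
# $1 is the log file to search; the patterns are taken from the error shown above.
find_startup_errors() {
    grep -E 'Unable to start DSE server|PluginActivationException|ConfigurationException' "$1"
}
```

Usage, assuming the log lives at /var/log/cassandra/system.log: find_startup_errors /var/log/cassandra/system.log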

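As the error shows, DSEFS no longer accepts the default PasswordAuthenticator. The default authenticator and authorizer are deprecated, so change the following settings in cassandra.yaml to the DSE equivalents: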

authenticator: com.datastax.bdp.cassandra.auth.DseAuthenticator
authorizer: com.datastax.bdp.cassandra.auth.DseAuthorizer
role_manager: com.datastax.bdp.cassandra.auth.DseRoleManager
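To confirm which of these settings are actually active (uncommented) in a configuration file, a quick grep is usually enough. The sketch below is illustrative only: show_auth_settings is a made-up name, and the path you pass in depends on how DSE was installed.

```shell
# Hypothetical helper: list the active auth-related settings in a YAML file.
# $1 is the path to the configuration file (an assumption, e.g. your cassandra.yaml).
# Commented-out lines start with '#', so they do not match the anchored pattern.
show_auth_settings() {
    grep -E '^(authenticator|authorizer|role_manager):' "$1"
}
```

If any of the three lines is missing from the output, or still names a non-DSE class, that setting has not been changed yet.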

In the dse.yaml file, configure the corresponding options:

Configure the DSE Authenticator by uncommenting authentication_options and changing the settings from:
# authentication_options:
#     enabled: false
#     default_scheme: internal

to:
authentication_options:
    enabled: true
    default_scheme: internal

Configure the DSE Role Manager by uncommenting role_management_options and setting the mode:
role_management_options:
    mode: internal

Configure the DSE Authorizer by uncommenting authorization_options and changing the settings:
authorization_options:
    enabled: true
#     transitional_mode: normal
#     allow_row_level_security: false
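Before restarting DSE, it can help to check that none of the three option blocks was left commented out. The following is a rough sketch, not an official tool: missing_security_blocks is a hypothetical name, and it only checks that each block header is uncommented, not that the values under it are correct.

```shell
# Hypothetical helper: print each DSE security option block that is
# missing or still commented out in the given dse.yaml.
# $1 is the path to dse.yaml (an assumption; it varies by install type).
missing_security_blocks() {
    for key in authentication_options role_management_options authorization_options; do
        # An active block header starts at column 0 with no leading '#'.
        grep -q "^${key}:" "$1" || echo "$key"
    done
}
```

Empty output means all three headers are active; any name printed still needs to be uncommented.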

Once done, run nodetool upgradesstables if you have not already run it.

[root@hqidlinfdb36 cassandra]# nodetool upgradesstables
WARN  14:50:54,675 Small cdc volume detected at /var/lib/cassandra/cdc_raw; setting cdc_total_space_in_mb to 243.  You can override this in cassandra.yaml
WARN  14:50:54,681 memtable_cleanup_threshold has been deprecated and should be removed from cassandra.yaml

You must configure the replication factors appropriate for using DSE Security in production environments. The keyspaces that require an increased replication factor are:
  • system_auth
  • dse_security

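Change the system_auth keyspace RF, listing each datacenter with its replication factor N:
ALTER KEYSPACE system_auth
    WITH REPLICATION = {'class' : 'NetworkTopologyStrategy',
                       'data_center_name' : N,
                       'data_center_name' : N};

For example, with two datacenters named dc1 and dc2:
ALTER KEYSPACE system_auth
    WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};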
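Since the same statement has to be written for several keyspaces, the template above can be filled in by a small shell function. This is an illustrative sketch under my own assumptions: build_alter_rf and its dc:rf argument format are invented here, and the function only prints the CQL rather than executing it.

```shell
# Hypothetical helper: print an ALTER KEYSPACE statement for NetworkTopologyStrategy.
# $1 is the keyspace name; every further argument is a datacenter:RF pair, e.g. dc1:3.
build_alter_rf() {
    ks="$1"
    shift
    opts="'class': 'NetworkTopologyStrategy'"
    for pair in "$@"; do
        dc="${pair%%:*}"   # text before the first colon
        rf="${pair##*:}"   # text after the last colon
        opts="$opts, '$dc': $rf"
    done
    printf 'ALTER KEYSPACE "%s" WITH REPLICATION = {%s};\n' "$ks" "$opts"
}
```

For example, build_alter_rf system_auth dc1:3 dc2:2 prints the statement for the two-datacenter case, which can then be reviewed and passed to cqlsh with its -e option.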

Change the dse_security keyspace RF:

ALTER KEYSPACE "dse_security"
   WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2};

Run nodetool repair on the security keyspaces:
nodetool repair --full system_auth
nodetool repair --full dse_security
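The two repairs can also be wrapped in a loop that stops at the first failure. This is a minimal sketch: the function name and the NODETOOL override variable are my own additions (the variable exists only so the command can be swapped out for a dry run), not part of DSE.

```shell
# Command used to invoke nodetool; override NODETOOL to point at a different binary.
NODETOOL="${NODETOOL:-nodetool}"

# Hypothetical helper: run a full repair on each security keyspace in turn,
# returning non-zero as soon as one repair fails.
repair_security_keyspaces() {
    for ks in system_auth dse_security; do
        "$NODETOOL" repair --full "$ks" || return 1
    done
}
```

Stopping on the first failure avoids repairing the second keyspace against a node that is already reporting problems.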

The --full option is required after changing the replication strategy. During the full repair, you may see the following warning, in which case the RF of the dse_leases keyspace must be changed as well:

insufficient replication (you created a new DC and didn't ALTER KEYSPACE dse_leases) and the duration (30000) being different (you have to disable/delete/recreate the lease to change the duration). No live replicas for lease Leader/master/5.1.dc1 in table dse_leases.leases Nodes [/] are all down/still starting.

ALTER KEYSPACE dse_leases WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3'} AND durable_writes = true;
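To double-check the result, the replication settings of all three keyspaces can be read back from the system_schema.keyspaces table. The sketch below assumes cqlsh is on the PATH and can connect without extra arguments; with authentication now enabled you will likely need to pass credentials as well. The function name and the CQLSH override variable are hypothetical.

```shell
# Command used to invoke cqlsh; override CQLSH if it needs a different path.
CQLSH="${CQLSH:-cqlsh}"

# Hypothetical helper: show the replication settings of the keyspaces changed above.
show_security_replication() {
    "$CQLSH" -e "SELECT keyspace_name, replication FROM system_schema.keyspaces WHERE keyspace_name IN ('system_auth', 'dse_security', 'dse_leases');"
}
```

Each row should list NetworkTopologyStrategy with the per-datacenter factors you set.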

Once this is done, DSE starts and runs successfully on version 5.1.