Apache Superset is an open-source analytics tool, similar to Power BI, for building dashboards with user-friendly visualizations.


Installation Steps

Running Docker commands requires root privileges. Prefix each command with sudo unless you are logged in as root; the commands below are shown without it.

  1. Clone the Superset GitHub repository and navigate to the folder.

    git clone --depth=1 https://github.com/apache/superset.git
    cd superset/
  2. Set the TAG environment variable to select the Superset version (image tag) that Docker Compose will download.

    export TAG=4.0.2
  3. Modify Superset’s configuration file before building the Docker containers.

    1. Go to docker/pythonpath_dev inside the cloned superset folder.

    2. Replace the file superset_config.py with the following configuration:

      # Licensed to the Apache Software Foundation (ASF) under one
      # or more contributor license agreements.  See the NOTICE file
      # distributed with this work for additional information
      # regarding copyright ownership.  The ASF licenses this file
      # to you under the Apache License, Version 2.0 (the
      # "License"); you may not use this file except in compliance
      # with the License.  You may obtain a copy of the License at
      #
      #   http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing,
      # software distributed under the License is distributed on an
      # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
      # KIND, either express or implied.  See the License for the
      # specific language governing permissions and limitations
      # under the License.
      #
      # This file is included in the final Docker image and SHOULD be overridden when
      # deploying the image to prod. Settings configured here are intended for use in local
      # development environments. Also note that superset_config_docker.py is imported
      # as a final step as a means to override "defaults" configured here
      #
      import logging
      import os
      
      from celery.schedules import crontab
      from flask_caching.backends.filesystemcache import FileSystemCache
      
      logger = logging.getLogger()
      
      DATABASE_DIALECT = os.getenv("DATABASE_DIALECT")
      DATABASE_USER = os.getenv("DATABASE_USER")
      DATABASE_PASSWORD = os.getenv("DATABASE_PASSWORD")
      DATABASE_HOST = os.getenv("DATABASE_HOST")
      DATABASE_PORT = os.getenv("DATABASE_PORT")
      DATABASE_DB = os.getenv("DATABASE_DB")
      
      EXAMPLES_USER = os.getenv("EXAMPLES_USER")
      EXAMPLES_PASSWORD = os.getenv("EXAMPLES_PASSWORD")
      EXAMPLES_HOST = os.getenv("EXAMPLES_HOST")
      EXAMPLES_PORT = os.getenv("EXAMPLES_PORT")
      EXAMPLES_DB = os.getenv("EXAMPLES_DB")
      
      # The SQLAlchemy connection string.
      SQLALCHEMY_DATABASE_URI = (
          f"{DATABASE_DIALECT}://"
          f"{DATABASE_USER}:{DATABASE_PASSWORD}@"
          f"{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
      )
      
      SQLALCHEMY_EXAMPLES_URI = (
          f"{DATABASE_DIALECT}://"
          f"{EXAMPLES_USER}:{EXAMPLES_PASSWORD}@"
          f"{EXAMPLES_HOST}:{EXAMPLES_PORT}/{EXAMPLES_DB}"
      )
      
      REDIS_HOST = os.getenv("REDIS_HOST", "redis")
      REDIS_PORT = os.getenv("REDIS_PORT", "6379")
      REDIS_CELERY_DB = os.getenv("REDIS_CELERY_DB", "0")
      REDIS_RESULTS_DB = os.getenv("REDIS_RESULTS_DB", "1")
      
      RESULTS_BACKEND = FileSystemCache("/app/superset_home/sqllab")
      
      CACHE_CONFIG = {
          "CACHE_TYPE": "RedisCache",
          "CACHE_DEFAULT_TIMEOUT": 300,
          "CACHE_KEY_PREFIX": "superset_",
          "CACHE_REDIS_HOST": REDIS_HOST,
          "CACHE_REDIS_PORT": REDIS_PORT,
          "CACHE_REDIS_DB": REDIS_RESULTS_DB,
      }
      DATA_CACHE_CONFIG = CACHE_CONFIG
      
      
      class CeleryConfig:
          broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
          imports = (
              "superset.sql_lab",
              "superset.tasks.scheduler",
              "superset.tasks.thumbnails",
              "superset.tasks.cache",
          )
          result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_RESULTS_DB}"
          worker_prefetch_multiplier = 1
          task_acks_late = False
          beat_schedule = {
              "reports.scheduler": {
                  "task": "reports.scheduler",
                  "schedule": crontab(minute="*", hour="*"),
              },
              "reports.prune_log": {
                  "task": "reports.prune_log",
                  "schedule": crontab(minute=10, hour=0),
              },
          }
      
      
      CELERY_CONFIG = CeleryConfig
      
      FEATURE_FLAGS = {"ALERT_REPORTS": True}
      ALERT_REPORTS_NOTIFICATION_DRY_RUN = True
      WEBDRIVER_BASEURL = "http://superset:8088/"  # When using docker compose baseurl should be http://superset_app:8088/
      # The base URL for the email report hyperlinks.
      WEBDRIVER_BASEURL_USER_FRIENDLY = WEBDRIVER_BASEURL
      SQLLAB_CTAS_NO_LIMIT = True
      
      SESSION_COOKIE_SAMESITE = None
      ENABLE_PROXY_FIX = True
      PUBLIC_ROLE_LIKE = "Gamma"
      PUBLIC_ROLE_LIKE_GAMMA = True
      WTF_CSRF_ENABLED = False
      GUEST_ROLE_NAME = "Gamma"
      OVERRIDE_HTTP_HEADERS = {}
      # Extend the flags defined above; a plain reassignment would drop ALERT_REPORTS.
      FEATURE_FLAGS = {**FEATURE_FLAGS, "EMBEDDED_SUPERSET": True}
      # Note: this hard-coded value overrides the environment-based URI built above.
      SQLALCHEMY_DATABASE_URI = 'postgresql://superset:superset@db:5432/superset'
      SUPERSET_FEATURE_EMBEDDED_SUPERSET = True
      ENABLE_CORS = True
      CORS_OPTIONS = {
          'supports_credentials': True,
          'allow_headers': ['*'],
          'resources': ['*'],
          'origins': ['*', 'http://localhost:8088', 'http://localhost:8888'],
      }
      
      TALISMAN_CONFIG = {
          "content_security_policy": {
              "base-uri": ["'self'"],
              "default-src": ["'self'"],
              "img-src": ["'self'", "https://raw.githubusercontent.com", "blob:", "data:"],
              "worker-src": ["'self'", "blob:"],
              "connect-src": [
                  "'self'",
                  "https://api.mapbox.com",
                  "https://events.mapbox.com",
              ],
              "object-src": "'none'",
              "style-src": [
                  "'self'",
                  "'unsafe-inline'",
              ],
              "script-src": ["'self'", "'strict-dynamic'"],
          },
          "content_security_policy_nonce_in": ["script-src"],
          "force_https": False,
          "session_cookie_secure": False,
      }
      
      
      #
      # Optionally import superset_config_docker.py (which will have been included on
      # the PYTHONPATH) in order to allow for local settings to be overridden
      #
      try:
          import superset_config_docker
          from superset_config_docker import *  # noqa
      
          logger.info(
              f"Loaded your Docker configuration at " f"[{superset_config_docker.__file__}]"
          )
      except ImportError:
          logger.info("Using default Docker config...")
  4. Use Docker Compose to launch the Superset containers.

    docker compose -f docker-compose-image-tag.yml up -d
  5. After a couple of minutes, Superset should be running on port 8088. If Nginx is already configured as described here, accessing the domain's root path will redirect to Superset.
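
Rather than waiting a fixed couple of minutes, the startup can be polled. The following is a minimal sketch, not part of the official setup: it assumes Superset answers on a /health endpoint at http://localhost:8088, and the function names, attempt count, and delay are illustrative choices.

```python
import time
import urllib.error
import urllib.request


def superset_is_healthy(base_url, timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200.

    The /health path is an assumption about the running Superset instance.
    """
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: not ready yet.
        return False


def wait_for_superset(
    base_url="http://localhost:8088",
    attempts=30,
    delay=10.0,
    probe=superset_is_healthy,
    sleep=time.sleep,
):
    """Probe up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(1, attempts + 1):
        if probe(base_url):
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

Calling wait_for_superset() with the defaults gives the containers up to roughly five minutes to start; it returns True as soon as a probe succeeds and False if none does.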

Official installation guide: https://superset.apache.org/docs/installation/docker-compose
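
A note on the connection string assembled in step 3: because it is built by plain string interpolation, a password containing characters such as @ or / would produce a malformed URI. The sketch below shows one way to guard against that by URL-encoding the credentials. The helper name build_database_uri is hypothetical and is not part of Superset's configuration.

```python
from urllib.parse import quote_plus


def build_database_uri(dialect, user, password, host, port, database):
    """Assemble a SQLAlchemy-style URI with URL-encoded credentials.

    Hypothetical helper for illustration; the arguments mirror the
    DATABASE_* environment variables read in superset_config.py.
    """
    return (
        f"{dialect}://"
        f"{quote_plus(user)}:{quote_plus(password)}@"
        f"{host}:{port}/{database}"
    )


# Example with a made-up password that contains reserved characters.
uri = build_database_uri("postgresql", "superset", "p@ss/word", "db", "5432", "superset")
print(uri)
```

With the credentials used in this guide (superset/superset) the encoding changes nothing, so the sketch only matters once a stronger password is chosen.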


Configuration

Requirements

  • Have the necessary datasets in a .zip file ready for import.

  • Have the dashboard in a .zip file ready for import.

  • The user must have the Admin role to import items and create new database connections.

Dataset Export

Download the Items

The Zimbabwe (ZWE) dashboard uses three different datasets to allow for various visualizations:

  1. zwe_elearning_certificates_view_dev

  2. zwe_elearning_enrolmentscourseunifiedview_dev

  3. zwe_course_grades

There is also a dashboard associated with these datasets. The dashboard's export file is a .zip named in the following format:
dashboard_export_[YYYYMMDD]_T[HHMMSS]

All datasets keep their original names from the zwe_elearning_dev database.
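
When several exports accumulate, the timestamp in the file name identifies the most recent one. The sketch below parses the naming format given above. The function name is hypothetical, the sample file name is made up, and accepting the name with or without the underscore before T (and with or without the .zip suffix) is an assumption added for tolerance.

```python
import re
from datetime import datetime

# dashboard_export_[YYYYMMDD]_T[HHMMSS], optionally followed by ".zip".
# The underscore before "T" is treated as optional (an assumption).
EXPORT_NAME = re.compile(r"^dashboard_export_(\d{8})_?T(\d{6})(?:\.zip)?$")


def export_timestamp(filename):
    """Return the export time encoded in the file name, or None if it does not match."""
    match = EXPORT_NAME.match(filename)
    if match is None:
        return None
    date_part, time_part = match.groups()
    try:
        return datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    except ValueError:
        # Digits were present but do not form a valid date or time.
        return None


# Hypothetical file name used only for illustration.
print(export_timestamp("dashboard_export_20240815_T103000.zip"))
```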

Import the Items

Import the items in the following order: set up the database connection first, then import the datasets, and import the dashboard last.

1. Connect to the Database

Path: Settings / Database Connections / + Database
Before importing the datasets or the dashboard, ensure that the database connection is properly set up.

2. Import Datasets

Path: Datasets / Import
Depending on the scenario, a dataset can be created through the interface or may need to be migrated so that additional elements such as metric columns are preserved. For the ZWE dashboard, the recommended approach is to download each dataset from the source server and upload it to the new Superset server. The datasets can be imported in any order, but every dataset the dashboard requires must be available on the server.

3. Import the Dashboard

Path: Dashboards / Import
Once the datasets are uploaded, proceed with the dashboard import. Afterwards, verify that all connections work and that there are no missing connections or errors related to the metrics the dashboard uses.
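
Before importing, it can help to check which datasets an export archive actually contains. The following sketch lists the dataset definitions inside a .zip and reports which of the three ZWE datasets are absent. It assumes that the export stores each dataset as a .yaml file named after the dataset inside a datasets folder; that layout and both function names are assumptions, so adjust them to the files you have.

```python
import zipfile
from pathlib import PurePosixPath

# The three datasets the ZWE dashboard depends on.
REQUIRED_DATASETS = {
    "zwe_elearning_certificates_view_dev",
    "zwe_elearning_enrolmentscourseunifiedview_dev",
    "zwe_course_grades",
}


def datasets_in_export(zip_source):
    """Return the names of .yaml files found under any 'datasets' folder.

    `zip_source` may be a path or a binary file-like object.
    """
    names = set()
    with zipfile.ZipFile(zip_source) as archive:
        for member in archive.namelist():
            path = PurePosixPath(member)
            # Only files whose parent folders include "datasets" count.
            if "datasets" in path.parts[:-1] and path.suffix == ".yaml":
                names.add(path.stem)
    return names


def missing_datasets(zip_source, required=REQUIRED_DATASETS):
    """Return the required dataset names that the archive does not contain."""
    return set(required) - datasets_in_export(zip_source)
```

An empty result from missing_datasets means every required dataset definition is present in the archive; it does not confirm that the underlying tables exist in the connected database.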

Notes

  • Always ensure that the imported datasets are properly linked to the dashboard.

  • If any metrics are missing after the import, verify the dataset configurations and the database connections.

  • If custom CSS applied to the dashboard does not work correctly, the cause may be class names that changed between Superset versions.
