Superset
On this page:
- 1 Installation Steps
- 2 Configuration
- 2.1 Requirements
- 2.2 Dataset Export
- 2.2.1 Download the Items
- 2.2.2 Import the Items
- 2.3 Dashboard Import
- 2.4 Notes
Apache Superset is an open-source analytics tool, similar to Power BI, used to build powerful dashboards with user-friendly visualizations.
Installation Steps
Before starting this guide, make sure Docker is installed on the virtual machine. To verify the installation, run the following command: docker -v
Docker provides an official installation guide for Ubuntu.
Executing Docker commands requires a user with sudo privileges. If you are running as root, the sudo prefix can be omitted.
Clone Superset GitHub repository and navigate to the folder.
git clone --depth=1 https://github.com/apache/superset.git
cd superset/
Run the following command to specify the Superset version to download.
export TAG=4.0.2
Modify Superset’s configuration file before building the Docker containers.
Go to superset/docker/pythonpath_dev
Replace the file superset_config.py with the following configuration:
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
# This file is included in the final Docker image and SHOULD be overridden when
# deploying the image to prod. Settings configured here are intended for use in local
# development environments. Also note that superset_config_docker.py is imported
# as a final step as a means to override "defaults" configured here
#
import logging
import os

from celery.schedules import crontab
from flask_caching.backends.filesystemcache import FileSystemCache

logger = logging.getLogger()

DATABASE_DIALECT = os.getenv("DATABASE_DIALECT")
DATABASE_USER = os.getenv("DATABASE_USER")
DATABASE_PASSWORD = os.getenv("DATABASE_PASSWORD")
DATABASE_HOST = os.getenv("DATABASE_HOST")
DATABASE_PORT = os.getenv("DATABASE_PORT")
DATABASE_DB = os.getenv("DATABASE_DB")

EXAMPLES_USER = os.getenv("EXAMPLES_USER")
EXAMPLES_PASSWORD = os.getenv("EXAMPLES_PASSWORD")
EXAMPLES_HOST = os.getenv("EXAMPLES_HOST")
EXAMPLES_PORT = os.getenv("EXAMPLES_PORT")
EXAMPLES_DB = os.getenv("EXAMPLES_DB")

# The SQLAlchemy connection string.
SQLALCHEMY_DATABASE_URI = (
    f"{DATABASE_DIALECT}://"
    f"{DATABASE_USER}:{DATABASE_PASSWORD}@"
    f"{DATABASE_HOST}:{DATABASE_PORT}/{DATABASE_DB}"
)

SQLALCHEMY_EXAMPLES_URI = (
    f"{DATABASE_DIALECT}://"
    f"{EXAMPLES_USER}:{EXAMPLES_PASSWORD}@"
    f"{EXAMPLES_HOST}:{EXAMPLES_PORT}/{EXAMPLES_DB}"
)

REDIS_HOST = os.getenv("REDIS_HOST", "redis")
REDIS_PORT = os.getenv("REDIS_PORT", "6379")
REDIS_CELERY_DB = os.getenv("REDIS_CELERY_DB", "0")
REDIS_RESULTS_DB = os.getenv("REDIS_RESULTS_DB", "1")

RESULTS_BACKEND = FileSystemCache("/app/superset_home/sqllab")

CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_HOST": REDIS_HOST,
    "CACHE_REDIS_PORT": REDIS_PORT,
    "CACHE_REDIS_DB": REDIS_RESULTS_DB,
}
DATA_CACHE_CONFIG = CACHE_CONFIG


class CeleryConfig:
    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
    imports = (
        "superset.sql_lab",
        "superset.tasks.scheduler",
        "superset.tasks.thumbnails",
        "superset.tasks.cache",
    )
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_RESULTS_DB}"
    worker_prefetch_multiplier = 1
    task_acks_late = False
    beat_schedule = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=10, hour=0),
        },
    }


CELERY_CONFIG = CeleryConfig

FEATURE_FLAGS = {"ALERT_REPORTS": True}
ALERT_REPORTS_NOTIFICATION_DRY_RUN = True
WEBDRIVER_BASEURL = "http://superset:8088/"  # When using docker compose baseurl should be http://superset_app:8088/
# The base URL for the email report hyperlinks.
WEBDRIVER_BASEURL_USER_FRIENDLY = WEBDRIVER_BASEURL

SQLLAB_CTAS_NO_LIMIT = True

SESSION_COOKIE_SAMESITE = None
ENABLE_PROXY_FIX = True
PUBLIC_ROLE_LIKE = "Gamma"
PUBLIC_ROLE_LIKE_GAMMA = True
WTF_CSRF_ENABLED = False
GUEST_ROLE_NAME = "Gamma"
OVERRIDE_HTTP_HEADERS = {}

FEATURE_FLAGS = {
    "EMBEDDED_SUPERSET": True
}

SQLALCHEMY_DATABASE_URI = 'postgresql://superset:superset@db:5432/superset'
SUPERSET_FEATURE_EMBEDDED_SUPERSET = True

ENABLE_CORS = True
CORS_OPTIONS = {
    'supports_credentials': True,
    'allow_headers': ['*'],
    'resources': ['*'],
    'origins': ['*', 'http://localhost:8088', 'http://localhost:8888'],
}

TALISMAN_CONFIG = {
    "content_security_policy": {
        "base-uri": ["'self'"],
        "default-src": ["'self'"],
        "img-src": ["'self'", "https://raw.githubusercontent.com", "blob:", "data:"],
        "worker-src": ["'self'", "blob:"],
        "connect-src": [
            "'self'",
            "https://api.mapbox.com",
            "https://events.mapbox.com",
        ],
        "object-src": "'none'",
        "style-src": [
            "'self'",
            "'unsafe-inline'",
        ],
        "script-src": ["'self'", "'strict-dynamic'"],
    },
    "content_security_policy_nonce_in": ["script-src"],
    "force_https": False,
    "session_cookie_secure": False,
}

#
# Optionally import superset_config_docker.py (which will have been included on
# the PYTHONPATH) in order to allow for local settings to be overridden
#
try:
    import superset_config_docker
    from superset_config_docker import *  # noqa
    logger.info(
        f"Loaded your Docker configuration at "
        f"[{superset_config_docker.__file__}]"
    )
except ImportError:
    logger.info("Using default Docker config...")
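The two connection strings in the configuration are assembled from environment variables. As a quick sanity check, with illustrative values matching the stock docker-compose Postgres service, the first one resolves to the same URI that is later hard-coded as SQLALCHEMY_DATABASE_URI:

```python
import os

# Illustrative values matching the stock docker-compose Postgres service;
# in a real deployment these come from the container environment.
os.environ.update({
    "DATABASE_DIALECT": "postgresql",
    "DATABASE_USER": "superset",
    "DATABASE_PASSWORD": "superset",
    "DATABASE_HOST": "db",
    "DATABASE_PORT": "5432",
    "DATABASE_DB": "superset",
})

# Same f-string assembly as in superset_config.py above.
uri = (
    f"{os.getenv('DATABASE_DIALECT')}://"
    f"{os.getenv('DATABASE_USER')}:{os.getenv('DATABASE_PASSWORD')}@"
    f"{os.getenv('DATABASE_HOST')}:{os.getenv('DATABASE_PORT')}/{os.getenv('DATABASE_DB')}"
)
print(uri)  # postgresql://superset:superset@db:5432/superset
```

If the containers fail to connect to the metadata database, comparing the resolved URI against these environment variables is a useful first debugging step.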
Use Docker Compose to launch the Superset containers. For the 4.0.x repository layout, the containers are started with the production-oriented compose file (the file name may differ in other Superset versions):
docker compose -f docker-compose-non-dev.yml up -d
After a couple of minutes, Superset should be running on port 8088. If Nginx is already configured as described here, accessing the domain on the root path will redirect to Superset.
Official installation guide: Docker Compose | Superset
Configuration
Download the necessary files to set up the ZWE eLearning Superset dashboard from the following repository: https://github.com/KnowTechTure/ZWE_Analytics
Requirements
- Have the necessary datasets in a .zip file ready for import.
- Have the dashboard in a .zip file ready for import.
- The user must have an admin role to import items and create new database connections.
Dataset Export
Download the Items
The Zimbabwe (ZWE) dashboard uses three different datasets to allow for various visualizations:
zwe_elearning_certificates_view_dev
zwe_elearning_enrolmentscourseunifiedview_dev
zwe_course_grades
There is also a dashboard associated with these datasets. The dashboard's export file is a .zip named in the following format:
dashboard_export_[YYYYMMDD]_T[HHMMSS]
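Before uploading, a downloaded export can be sanity-checked against the naming convention above with a short script (the .zip extension and the exact digit pattern are assumptions inferred from the format shown):

```python
import re

# Pattern inferred from the naming convention
# dashboard_export_[YYYYMMDD]_T[HHMMSS], with an assumed .zip extension.
EXPORT_NAME = re.compile(r"^dashboard_export_\d{8}_T\d{6}\.zip$")

def is_dashboard_export(filename: str) -> bool:
    """Return True if the filename follows the dashboard export convention."""
    return EXPORT_NAME.match(filename) is not None

print(is_dashboard_export("dashboard_export_20240516_T101530.zip"))  # True
print(is_dashboard_export("some_other_file.zip"))                    # False
```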
Import the Items
It is critical to follow the correct order when importing the datasets and the dashboard.
1. Connect to the Database
Path: setting/database connection/database
Before migrating or exporting a dashboard, ensure that the database connection is properly set up.
2. Import Datasets
Path: dataset/import
Depending on the scenario, the dataset can be created via the interface or may need to be migrated to include additional elements such as metric columns. For the ZWE dashboard, the best practice is to download the dataset and upload it to the new Superset server. The order of importing the datasets is not crucial, but it is essential that the datasets required by the dashboard are available on the server.
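Imports can also be scripted instead of done through the UI. The sketch below logs in and uploads a dataset export through Superset's REST API; the host, credentials, and file name are placeholders, the "overwrite" form field is an assumption, and the dashboard endpoint /api/v1/dashboard/import/ is used the same way:

```python
import requests

SUPERSET_URL = "http://localhost:8088"  # placeholder; replace with your server

def import_dataset_zip(zip_path: str, username: str, password: str) -> None:
    """Log in to Superset and upload a dataset export .zip via the REST API."""
    session = requests.Session()

    # Obtain a JWT access token from the auth endpoint.
    login = session.post(
        f"{SUPERSET_URL}/api/v1/security/login",
        json={"username": username, "password": password,
              "provider": "db", "refresh": True},
    )
    login.raise_for_status()
    token = login.json()["access_token"]

    # Upload the export; the import endpoints expect a "formData" file field.
    # (WTF_CSRF_ENABLED is False in the config above, so no CSRF token is sent.)
    with open(zip_path, "rb") as f:
        resp = session.post(
            f"{SUPERSET_URL}/api/v1/dataset/import/",
            headers={"Authorization": f"Bearer {token}"},
            files={"formData": (zip_path, f, "application/zip")},
            data={"overwrite": "true"},  # assumed flag; verify for your version
        )
    resp.raise_for_status()
```

This is only a sketch of the API flow; for the ZWE dashboard the UI import described above is sufficient.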
Dashboard Import
Path: dashboard/import
Once the datasets are uploaded, you can proceed with the dashboard import. Verify that all connections are functioning correctly and ensure there are no missing connections or errors related to any metrics that the dashboard uses.
Notes
- Always ensure that the imported datasets are properly linked to the dashboard.
- If any metrics are missing after the import, verify the dataset configurations and the database connections.
- If custom CSS applied to the dashboard does not render correctly, it could be due to changes in class names between Superset versions affecting the visualizations.