
Introduction
Soda Core, referred to as SODA throughout this guide, is an open-source data quality testing tool that helps data engineers and analysts monitor, profile, and validate data. Installing it locally supports development, testing, and debugging workflows without relying on cloud-based environments. A local setup is useful for validating data pipelines, catching data anomalies early, and working offline when necessary.
This guide covers different installation methods, including Docker-based deployment and language-specific package managers like pip (Python), npm (Node.js), gem (Ruby), and Maven/Gradle (Java).
Installing SODA with Docker
Using Docker simplifies dependency management and ensures a consistent runtime environment. If Docker is installed, run the following command to pull and start a SODA container:
docker run --rm -it sodadata/soda-core:latest
To mount a local directory and access custom configuration files, use:
docker run --rm -it -v "$(pwd)/config:/app/config" sodadata/soda-core:latest
This method is ideal for testing without modifying the local environment.
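When the container is started from a script rather than typed by hand, the same docker run invocation can be assembled programmatically. The following is a minimal sketch, not part of SODA itself: the helper name build_soda_docker_command is hypothetical, the /app/config mount point is taken from the command above, and passing extra arguments such as scan -h after the image name assumes the image forwards them to the soda CLI.

```python
import shlex
from pathlib import Path


def build_soda_docker_command(
    config_dir,
    image="sodadata/soda-core:latest",
    mount_point="/app/config",  # mount point used in this guide; adjust if your setup differs
    soda_args=(),
):
    """Return the argument list for a `docker run` that mounts config_dir into the container."""
    # Docker needs an absolute host path for bind mounts, so resolve it first.
    host_path = Path(config_dir).expanduser().resolve()
    return [
        "docker", "run", "--rm", "-it",
        "-v", f"{host_path}:{mount_point}",
        image,
        *soda_args,  # assumption: arguments after the image name are passed on to soda
    ]


if __name__ == "__main__":
    # Only print the command; running it requires Docker and an interactive terminal.
    print(shlex.join(build_soda_docker_command("config")))
```

Building the command as a list keeps paths with spaces intact and lets it be handed straight to subprocess.run once Docker is known to be available.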
Installing SODA with pip (Python)
SODA provides a Python-based implementation, soda-core, for integrating with data pipelines. Install it using pip:
pip install soda-core
To install support for a specific database (e.g., PostgreSQL or Snowflake), use:
pip install soda-core-postgres
pip install soda-core-snowflake
Verify the installation:
soda scan -h
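The same verification can be scripted, for example as a pre-flight step in a pipeline. The sketch below uses only the Python standard library and assumes the package names shown above (soda-core, soda-core-postgres, soda-core-snowflake). The function names installed_versions and soda_cli_works are hypothetical helpers, not part of SODA.

```python
import shutil
import subprocess
from importlib import metadata

# Package names taken from the pip commands in this guide.
PACKAGES = ("soda-core", "soda-core-postgres", "soda-core-snowflake")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def soda_cli_works(timeout=30):
    """Return True if `soda` is on PATH and `soda scan -h` exits with status 0."""
    executable = shutil.which("soda")
    if executable is None:
        return False
    try:
        result = subprocess.run(
            [executable, "scan", "-h"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
    print("soda CLI usable:", soda_cli_works())
```

Checking both the package metadata and the CLI catches the common case where the package is installed in one virtual environment while the shell resolves soda from another, or not at all.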
Installing SODA with npm (Node.js)
Soda Core itself is distributed as a Python package. For JavaScript/TypeScript projects, if a SODA client is published on npm under this name, install it using npm:
npm install soda-core
If using Yarn:
yarn add soda-core
Verify the installation:
npx soda scan -h
Installing SODA with gem (Ruby)
If a Ruby integration exists, install it via RubyGems:
gem install soda-core
Verify the installation:
soda scan -h
Installing SODA with Maven or Gradle (Java)
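For Java-based projects, if a SODA artifact is published for the JVM, add the dependency to your pom.xml (Maven). Replace VERSION with a specific release number: pinning a version keeps builds reproducible, and the literal string latest is not treated as the newest release:
<dependency>
  <groupId>com.soda</groupId>
  <artifactId>soda-core</artifactId>
  <version>VERSION</version>
</dependency>
For Gradle, add the following to build.gradle, again replacing VERSION with a specific release number:
dependencies {
    implementation 'com.soda:soda-core:VERSION'
}
Verify the installation by printing the scan help:
java -jar soda-core.jar scan -h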
Managing and Verifying the Installation
To check that SODA is installed correctly, run:
soda scan -h
To update SODA, use the command that matches the package manager you installed it with:
pip install --upgrade soda-core
npm update soda-core
gem update soda-core
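For the pip-based installation, upgrades run from automation are more predictable when pip is invoked through the same interpreter that runs the pipeline. The following is a small sketch under that assumption. The helper name pip_upgrade_command is hypothetical, and the package names are the ones used earlier in this guide.

```python
import shlex
import sys


def pip_upgrade_command(packages=("soda-core",), python=sys.executable):
    """Return the argument list that upgrades the given packages with a specific interpreter."""
    if not packages:
        raise ValueError("at least one package name is required")
    # `python -m pip` targets the environment of that interpreter,
    # rather than whichever `pip` happens to be first on PATH.
    return [python, "-m", "pip", "install", "--upgrade", *packages]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(pip_upgrade_command(("soda-core", "soda-core-postgres"))))
```

Upgrading soda-core together with any database-specific packages in one command lets pip resolve them to compatible versions.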
Conclusion
Installing SODA locally enables data validation and quality testing across multiple environments. Docker provides an isolated setup, while package managers integrate SODA into existing projects. Running soda scan -h after installation confirms that the CLI is available before it is wired into a data pipeline.
Last Releases
- v3.5.5. What's changed:
  - Update README.md with launch banner by @santiviquez in #2292
  - Fix authentication inside Fabric Notebooks by @sdebruyn in #2299
  - Add dotenv to deps, fixes #2285 by @m1n0 in #2312
  - …
- v4.0.0b1. Source: https://github.com/sodadata/soda-core/releases/tag/v4.0.0b1
- v3.5.4. Source: https://github.com/sodadata/soda-core/releases/tag/v3.5.4