How to Install SODA Locally for Data Quality Testing


Introduction

SODA (Soda Core) is an open-source data quality testing tool that helps data engineers and analysts monitor, profile, and validate data. Installing it locally lets you develop, test, and debug data quality checks without relying on a cloud-based environment. A local setup is useful for validating data pipelines, catching data anomalies early, and working offline when necessary.

This guide covers installation with Docker and pip (Python), the two routes that map to published Soda Core artifacts, and then describes npm (Node.js), gem (Ruby), and Maven/Gradle (Java) setups, whose package names should be verified before use.

Installing SODA with Docker

Using Docker simplifies dependency management and ensures a consistent runtime environment. With Docker installed, run the following command to pull the image and start a temporary SODA container (--rm removes the container when it exits):

docker run --rm -it sodadata/soda-core:latest

To mount a local directory and access custom configuration files, use the following command (the quotes keep the mount working when the path contains spaces):

docker run --rm -it -v "$(pwd)/config:/app/config" sodadata/soda-core:latest

This method is ideal for testing without modifying the local environment.
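
As a rough sketch of what the mounted directory might contain, the Python script below writes a minimal configuration.yml and checks.yml and builds a docker run command that scans with them. The data source name my_postgres, the orders table, the customer_id column, the environment variable names, and both helper functions are hypothetical. The YAML layout follows the Soda Core 3.x conventions, so confirm it against the version you run.

```python
import shlex
from pathlib import Path

# Hypothetical data source; credentials are read from environment variables.
CONFIGURATION_YML = """\
data_source my_postgres:
  type: postgres
  host: localhost
  port: "5432"
  username: ${POSTGRES_USER}
  password: ${POSTGRES_PASSWORD}
  database: postgres
  schema: public
"""

# Hypothetical checks against an assumed "orders" table.
CHECKS_YML = """\
checks for orders:
  - row_count > 0
  - missing_count(customer_id) = 0
"""


def write_soda_config(directory):
    """Write the two YAML files into `directory` and return their paths."""
    config_dir = Path(directory)
    config_dir.mkdir(parents=True, exist_ok=True)
    configuration = config_dir / "configuration.yml"
    checks = config_dir / "checks.yml"
    configuration.write_text(CONFIGURATION_YML, encoding="utf-8")
    checks.write_text(CHECKS_YML, encoding="utf-8")
    return configuration, checks


def docker_scan_command(directory, data_source="my_postgres",
                        image="sodadata/soda-core:latest"):
    """Build (but do not run) a docker command that scans the mounted files."""
    host_dir = Path(directory).resolve()
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:/app/config",
        image,
        "scan", "-d", data_source,
        "-c", "/app/config/configuration.yml",
        "/app/config/checks.yml",
    ]


if __name__ == "__main__":
    write_soda_config("config")
    # shlex.join quotes the mount path if it contains spaces.
    print(shlex.join(docker_scan_command("config")))
```

The script only prints the command. A database running on the host is usually not reachable as localhost from inside a container, and the two environment variables would have to be passed with docker's -e option, so adjust the host and credentials for your setup.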

Installing SODA with pip (Python)

SODA provides a Python-based implementation, soda-core, for integrating with data pipelines. Install it using pip:

pip install soda-core

To add support for a specific database (e.g., PostgreSQL or Snowflake), install the matching connector package, which pulls in soda-core as a dependency:

pip install soda-core-postgres
pip install soda-core-snowflake
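
To keep these packages out of the system Python, the install can be run inside a virtual environment. The sketch below is one way to script that with the standard library. The helper names and the .venv directory are hypothetical, and it assumes connector packages follow the soda-core-<data source> naming shown above.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the interpreter path inside a virtual environment."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir, data_sources):
    """Build a pip command bound to the environment's own interpreter."""
    packages = [f"soda-core-{name}" for name in data_sources]
    return [str(venv_python(env_dir)), "-m", "pip", "install", *packages]


def install(env_dir, data_sources):
    """Create the environment and install into it (needs network access)."""
    venv.create(env_dir, with_pip=True)
    subprocess.run(pip_install_command(env_dir, data_sources), check=True)


if __name__ == "__main__":
    # Dry run: show the command without creating or installing anything.
    print(pip_install_command(".venv", ["postgres", "snowflake"]))
```

Calling install(".venv", ["postgres"]) would perform the real installation. Running pip through the environment's interpreter avoids installing into a different Python than the one that later runs soda.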

Verify the installation:

soda scan -h
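
The same check can be scripted, for example in a CI job. This minimal sketch uses only the standard library to report which packages are installed in the current Python environment and whether a soda executable is on PATH. The function names are hypothetical and the package list is an example.

```python
import shutil
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_install(packages=("soda-core",), command="soda"):
    """Report package versions and where the CLI resolves on PATH."""
    return {
        "packages": {name: installed_version(name) for name in packages},
        "cli_path": shutil.which(command),
    }


if __name__ == "__main__":
    report = check_install(("soda-core", "soda-core-postgres"))
    for name, version in report["packages"].items():
        print(f"{name}: {version or 'not installed'}")
    print(f"soda CLI: {report['cli_path'] or 'not found on PATH'}")
```

A package that is installed while the CLI is missing from PATH usually means the environment that holds it is not activated.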

Installing SODA with npm (Node.js)

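For JavaScript/TypeScript projects, note that Soda Core itself is distributed as a Python package. The npm package name below is unverified, so confirm in the npm registry that it is the client you expect before installing: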

npm install soda-core

If using Yarn:

yarn add soda-core

Verify the installation:

npx soda scan -h

Installing SODA with gem (Ruby)

Soda Core is a Python tool, and a Ruby integration is not confirmed. If one exists under this name, install it via RubyGems:

gem install soda-core

Verify the installation:

soda scan -h

Installing SODA with Maven or Gradle (Java)

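For Java-based projects, a Soda-published Maven artifact is not confirmed, so treat the coordinates below as placeholders and verify them in your repository before use. Maven also needs an explicit version number in place of latest. Add the dependency to your pom.xml (Maven):

<dependency>
    <groupId>com.soda</groupId>
    <artifactId>soda-core</artifactId>
    <!-- placeholder: replace with an explicit release version -->
    <version>latest</version>
</dependency>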

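For Gradle, add the dependency to build.gradle, replacing latest with an explicit version once the coordinates are confirmed:

dependencies {
    // Placeholder coordinates; Gradle's dynamic selector is latest.release, not latest
    implementation 'com.soda:soda-core:latest'
}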

If the dependency ships an executable JAR (an assumption), display the scan help to confirm it runs:

java -jar soda-core.jar scan -h

Managing and Verifying the Installation

To check that SODA is installed correctly, run:

soda scan -h

To update SODA, use the command for the package manager you installed it with:

pip install --upgrade soda-core
npm update soda-core
gem update soda-core
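
After an upgrade it can help to confirm that the installed version meets a minimum. The sketch below is a simplified comparison with hypothetical helper names. It reads only the leading numeric part of a version string, so the packaging library is the more robust choice for real tooling.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Extract the leading numeric release, e.g. 'v3.5.5' -> (3, 5, 5)."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def has_prerelease_suffix(version):
    """True for an alpha, beta, or release-candidate suffix such as 'b1'."""
    return re.search(r"(?:a|b|rc)\d+$", version) is not None


def meets_minimum(installed, minimum):
    """Compare numeric releases, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    try:
        current = metadata.version("soda-core")
    except metadata.PackageNotFoundError:
        print("soda-core is not installed in this environment")
    else:
        print(current, "meets 3.5.5:", meets_minimum(current, "3.5.5"))
        print("pre-release:", has_prerelease_suffix(current))
```

By default, pip install --upgrade skips pre-releases such as a beta, so moving to one requires the --pre option or an explicit version pin.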

Conclusion

Installing SODA locally enables data validation and quality testing across multiple environments. Docker provides an isolated setup, while package managers allow for easy integration with existing projects. Verifying the installation with soda scan -h confirms the CLI is available before it is wired into a data pipeline.

Last Releases

  • v3.5.5
    What’s Changed: Update README.md with launch banner by @santiviquez in #2292; Fix authentication inside Fabric Notebooks by @sdebruyn in #2299; Add dotenv to deps, fixes #2285 by @m1n0 in #2312…
  • v4.0.0b1
    Source: https://github.com/sodadata/soda-core/releases/tag/v4.0.0b1
  • v3.5.4
    Source: https://github.com/sodadata/soda-core/releases/tag/v3.5.4
