4 Matching Annotations
  1. Nov 2020
    1. Orchestrating dbt models with other computations: To see how this works in practice, let’s look at a simple Dagster pipeline that orchestrates dbt models together with other, heterogeneous processes operating on your data. (The full code for this example is available on GitHub.) In this pipeline, we’ll download a raw .csv dataset from the public internet, then load it into a database using Pandas. We’ll run a series of dbt models to transform the data, run our dbt tests on the resulting database artifacts, and produce some plots using the transformed data. Finally, we’ll upload those plots to Slack for visibility. While contrived, this example illustrates how dbt is often embedded into a larger context. Before dbt can operate on data, it has to be ingested from somewhere, using tools that lie outside of its purview. And after dbt has transformed data, that data is consumed by downstream users, who may use a wide range of technologies to do their work. The dbt solids execute alongside the other components of the pipeline. You can see below that logs emitted by the running dbt models are streamed back to a central view, along with logs produced by solids making use of other technologies.

      this pattern of processing a CSV and then running database queries is close to work we frequently need to do.

      [[dbt]] [[dagster]]
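
A minimal sketch of the ingest, transform, and test steps described in the quoted passage, to make the shape of the pipeline concrete. It is not the Dagster example itself: it swaps Pandas, dbt, and Dagster for the Python standard library (`csv` and `sqlite3`) so it runs without dependencies, and it leaves out the plotting and Slack steps. The inline data, the table and view names, and all function names are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for the downloaded .csv file.
RAW_CSV = """name,amount
alpha,110
beta,70
gamma,120
"""


def load_csv(conn, csv_text):
    """Ingest step: load CSV rows into a table (the quoted pipeline uses Pandas here)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute("CREATE TABLE raw_data (name TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO raw_data (name, amount) VALUES (?, ?)",
        [(row["name"], int(row["amount"])) for row in rows],
    )
    return len(rows)


def build_model(conn):
    """Transform step: a plain SELECT exposed as a view, in the spirit of a dbt model."""
    conn.execute(
        "CREATE VIEW large_amounts AS "
        "SELECT name, amount FROM raw_data WHERE amount >= 100"
    )


def check_model(conn):
    """Test step: a not-null check on one column of the transformed data."""
    (null_count,) = conn.execute(
        "SELECT COUNT(*) FROM large_amounts WHERE name IS NULL"
    ).fetchone()
    return null_count == 0


def run_pipeline():
    """Run the steps in order and return what a downstream consumer would see."""
    conn = sqlite3.connect(":memory:")
    loaded = load_csv(conn, RAW_CSV)
    build_model(conn)
    passed = check_model(conn)
    names = [
        row[0]
        for row in conn.execute("SELECT name FROM large_amounts ORDER BY name")
    ]
    conn.close()
    return loaded, passed, names
```

Calling `run_pipeline()` returns the number of rows loaded, whether the check passed, and the names that survive the filter. In a real setup each function would presumably become its own orchestrated step, so that logs and failures are reported per step.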

    2. We love dbt because of the values it embodies. Individual transformations are SQL SELECT statements, without side effects. Transformations are explicitly connected into a graph. And support for testing is first-class. dbt is hugely enabling for an important class of users, adapting software engineering principles to a slightly different domain with great ergonomics. For users who already speak SQL, dbt’s tooling is unparalleled.

      when using [[dbt]], the [[transformations]] are [[SQL statements]], which is something our team already knows

    3. What is dbt? dbt was created by Fishtown Analytics to enable data analysts to build well-defined data transformations in an intuitive, testable, and versioned environment. Users build transformations (called models) defined in templated SQL. Models defined in dbt can refer to other models, forming a dependency graph between the transformations (and the tables or views they produce). Models are self-documenting, easy to test, and easy to run. And the dbt tooling can use the graph defined by models’ dependencies to determine the ancestors and descendants of any individual model, so it’s easy to know what to recompute when something changes.

      one of the [[benefits of [[dbt]]]] is that the [[data transformations]] or [[data models]] can refer to other models, and help show the [[dependency graph]] between transformations
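
To illustrate the ancestors and descendants idea from the quoted passage, here is a small sketch of a model dependency graph and a traversal over it. It does not use dbt or reflect how dbt implements this internally; the model names and the functions `reachable`, `ancestors`, and `descendants` are hypothetical.

```python
from collections import deque

# Hypothetical graph: each model maps to the models it refers to (its parents).
DEPENDS_ON = {
    "stg_orders": [],
    "stg_customers": [],
    "orders_enriched": ["stg_orders", "stg_customers"],
    "daily_revenue": ["orders_enriched"],
}


def reachable(start, edges):
    """Return every node reachable from `start` by following `edges` (breadth-first)."""
    seen = set()
    queue = deque(edges[start])
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(edges[node])
    return seen


def ancestors(model, depends_on):
    """Models that `model` depends on, directly or transitively."""
    return reachable(model, depends_on)


def descendants(model, depends_on):
    """Models that would need recomputing if `model` changed."""
    # Invert the edges so each model maps to the models that refer to it.
    children = {name: [] for name in depends_on}
    for child, parents in depends_on.items():
        for parent in parents:
            children[parent].append(child)
    return reachable(model, children)
```

With this graph, `descendants("stg_orders", DEPENDS_ON)` gives the set of models to rebuild after a change to `stg_orders`, which is the "what to recompute" question the passage mentions.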

  2. Sep 2016