OCDQuery


Scala library for repositories generating Doobie queries from higher-kinded data

What OCDQuery does


Overly-Complicated Database Query takes a case class defined by you and turns it into a set of CRUD queries.

Basically, it creates a repository based on your (higher-kinded) case class.

Additionally, repositories can be joined together to fetch tuples of objects.

All returned results are Doobie queries.

How to use it?


Add the OCDQuery dependency to your build (released only for Scala 2.12 and 2.13):

libraryDependencies += "io.scalaland" %% "ocdquery" % "0.5.0"

If your model contains only fields populated and managed by the API user, define it as a value:

// a value has one type param with a gap, used to wrap the types of all fields
final case class ConfigF[Field[_]](
  operatingSystem: Field[String],
  numberOfCores:   Field[Int]
)
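
The gap lets you reuse the one definition with different wrappers, e.g. (illustrative values only):

import cats.Id

val config: ConfigF[Id]     = ConfigF[Id]("Linux", 8)        // a plain value
val filter: ConfigF[Option] = ConfigF[Option](None, Some(4)) // every field optional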

or, if it has fields populated by the database - assigned IDs, auto increment, DEFAULT now(), etc., which can be tied to a record's lifecycle - use an entity:

import java.time.LocalDate
import java.util.UUID
// an entity has two type params with gaps:
// - the first for fields passed manually on create,
// - the other for fields created by the database
final case class UserF[NormalField[_], CreatedField[_]](
  id:        CreatedField[UUID],
  name:      NormalField[String],
  surname:   NormalField[String],
  createdAt: CreatedField[LocalDate]
)

then create the repository out of it:

import cats.Id
import io.scalaland.ocdquery._
import shapeless.Generic

val ConfigRepo = {
  implicit val configRead = QuasiAuto.read(Generic[ConfigF[Id]])
  Repo.forValue[ConfigF]("cfgs".tableName, DefaultColumnNames.forValue[ConfigF])
}

val UserRepo = {
  implicit val userRead = QuasiAuto.read(Generic[UserF[Id]])
  Repo.forEntity[UserF]("users".tableName, DefaultColumnNames.forEntity[UserF])
}

and you are set up!

Configuring column mappings

Default column mappings returned by DefaultColumnNames use the field name from the case class as the name of the column it is mapped to. You can modify them by updating all fields globally:

import io.scalaland.ocdquery._

DefaultColumnNames.forValue[ConfigF].updateColumns("_" + _)
DefaultColumnNames.forValue[ConfigF].updateColumns(_.toLowerCase)
DefaultColumnNames.forValue[ConfigF].snakeCaseColumns

or by modifying just a single field:

DefaultColumnNames.forValue[ConfigF].copy[ColumnName](
  operatingSystem = "os".columnName
)
DefaultColumnNames.forEntity[UserF].copy[ColumnName, ColumnName](
  name = "name_".columnName
)

It is a bit ugly, though, which is why I recommend using a lens library like Quicklens:

import com.softwaremill.quicklens._
import io.scalaland.ocdquery._

DefaultColumnNames.forValue[ConfigF]
  .modify(_.operatingSystem).setTo("os".columnName)
DefaultColumnNames.forEntity[UserF]
  .modify(_.name).setTo("name_".columnName)

or Monocle:

import io.scalaland.ocdquery._
import monocle.macros.syntax.lens._

DefaultColumnNames.forValue[ConfigF]
  .lens(_.operatingSystem).set("os".columnName)
DefaultColumnNames.forEntity[UserF]
  .lens(_.name).set("name_".columnName)

(I will use Quicklens for the following examples).

Inserting new data

In order to insert new data you just need to pass it into the .insert method:

// value example - value looks the same during creation as later
ConfigRepo.insert(
  ConfigF[Id]("Windows", 4)
).run

// entity example
// an entity has some fields created by the database, so they kind of
// "disappear" from the create type - in such a case it is prettier to
// create it from a tuple containing only the required fields
UserRepo.insert(
  Create.forEntity[UserF].fromTuple(("John", "Smith"))
).run
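
For reference, the tuple above stands in for the full create value, in which database-managed fields become Unit. A sketch assuming the UnitF alias (described in the "How does it work?" section) is exposed by the library and that .insert accepts the full create shape directly:

import cats.Id
import io.scalaland.ocdquery._

UserRepo.insert(
  UserF[Id, UnitF](
    id        = (), // assigned by the database
    name      = "John",
    surname   = "Smith",
    createdAt = ()  // assigned by the database
  )
).run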

Fetching existing data

To fetch existing data you need to use .fetch and specify some filter:

import io.scalaland.ocdquery.sql._ // definitions common to all SQL dialects

// fetch all
ConfigRepo.fetch(_ => emptyFilter).to[List]

// filter by column=value/column<>value
ConfigRepo.fetch(_.operatingSystem `=` "Windows").to[List]
ConfigRepo.fetch(_.operatingSystem <> "Windows").to[List]

// filter by column1=column2/column1<>column2
UserRepo.fetch(cols => cols.name `=` cols.surname).to[List]
UserRepo.fetch(cols => cols.name <> cols.surname).to[List]

// filter using IN
ConfigRepo.fetch(_.numberOfCores.in(2, 4, 6)).to[List]

// filter using (NOT) BETWEEN
ConfigRepo.fetch(_.numberOfCores.between(4, 8)).to[List]
ConfigRepo.fetch(_.numberOfCores.notBetween(4, 8)).to[List]

// filter using LIKE
ConfigRepo.fetch(_.operatingSystem.like("Windows%")).to[List]

// AND, OR and NOT are also available
UserRepo.fetch { cols =>
  (cols.name `=` "John") and (cols.surname `=` "Smith")
}.option

You can also sort and paginate results:

UserRepo
  .fetch
  .withSort(_.surname, Sort.Ascending)
  .withOffset(5)
  .withLimit(10)(_ => emptyFilter).to[List]

or calculate the size of the result, or check whether it would be non-empty:

UserRepo.count(_ => emptyFilter).unique
ConfigRepo.exists(_.operatingSystem `=` "BeOS").unique
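
All of these return plain Doobie programs, so you run them through a Transactor like any other Doobie code. A minimal sketch with placeholder driver and credentials (cats-effect 2 style, hence the ContextShift):

import scala.concurrent.ExecutionContext
import cats.effect.{ ContextShift, IO }
import doobie._
import doobie.implicits._

implicit val cs: ContextShift[IO] = IO.contextShift(ExecutionContext.global)

// placeholder connection settings - adjust to your database
val xa = Transactor.fromDriverManager[IO](
  "org.postgresql.Driver", "jdbc:postgresql:mydb", "user", "pass"
)

ConfigRepo.fetch(_.operatingSystem `=` "Windows").to[List].transact(xa) // IO of the fetched list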

Updating existing data

In order to .update data you have to filter which rows should be updated and define which columns should be set to specific values:

UserRepo.update(_ => emptyFilter)(UserRepo.emptyUpdate).run

emptyUpdate is similar to DefaultColumnNames - it can be modified using .copy or (my suggestion) with a lens:

UserRepo.update(_.surname `=` "Smith")(
  UserRepo.emptyUpdate.modify(_.name).setTo("Johnny")
).run
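
For completeness, the .copy route would look like this - a sketch assuming the library's Updatable wrapper (mentioned in the implementation notes below) with a "set this column" constructor I assume here is named UpdateTo; check the actual API:

UserRepo.update(_.surname `=` "Smith")(
  // explicit type params, like in the DefaultColumnNames.copy examples;
  // UpdateTo is an assumed name - verify the constructor in the library
  UserRepo.emptyUpdate.copy[Updatable, Updatable](name = UpdateTo("Johnny"))
).run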

Deleting data

Delete works similarly to fetching, except that it removes everything it matches instead of returning it:

UserRepo.delete(_.name <> "John").run

Fetching tuples

When you take a Repo and call .join on it, you will obtain a Fetcher, which cannot modify data, but it can fetch a tuple of types from both repositories. A Fetcher can again be combined with a Repo to obtain a Fetcher of a triple, then a quadruple, etc.:

UserRepo.join(UserRepo) // Fetcher of (User, User)
UserRepo.join(UserRepo).join(UserRepo) // Fetcher of (User, User, User)

During joining you can define the type of JOIN by passing a JoinType, and then you can define the ON condition(s):

UserRepo.join(UserRepo, JoinType.OUTER)
  .on(_._1.name, _._2.surname) // columns for ON come from tuple

Once you build a Fetcher you can .fetch data the same way as with a repository (except that now you extract columns from a tuple):

UserRepo.join(UserRepo).join(UserRepo).fetch { cols =>
  (cols._1.surname <> cols._2.surname) and
    (cols._1.surname <> cols._3.surname) and
    (cols._2.surname <> cols._3.surname)
}.to[List]

How does it work?


Initial idea

Imagine you wanted to generate queries from data objects.

import java.util.UUID
import java.time.LocalDate

final case class Ticket(
  id:      UUID,
  name:    String,
  surname: String,
  from:    String,
  to:      String,
  date:    LocalDate
)


import doobie._
import doobie.implicits._

object TicketRepository {
  // ...
}

You may want to be able to insert a new entity:

final case class TicketCreate(
  id:      Unit,   // placeholder as this is created by the database
  name:    String,
  surname: String,
  from:    String,
  to:      String,
  date:    LocalDate
)

object TicketRepository {
   // ...
   def insert(entity: TicketCreate): Update0 = ???
   // ...
}

You may want to be able to fetch data using some obligatory criteria (e.g. the ID) and maybe some optional ones (other fields equal to some values):

final case class TicketFilter(
  id:      Option[UUID],     // where id      = [this value if set]
  name:    Option[String],   // and   name    = [this value if set]
  surname: Option[String],   // and   surname = [this value if set]
  from:    Option[String],   // and   from    = [this value if set]
  to:      Option[String],   // and   to      = [this value if set]
  date:    Option[LocalDate] // and   date    = [this value if set]
)

object TicketRepository {
   // ...
   def fetch(entityFilter: TicketFilter): Query0[Ticket] = ???
   // ...
}

You might want to update an existing entity using a case class - this would make building queries easy.

final case class TicketUpdate(
  id:      Option[Unit],     // set id      = [this value if set],
  name:    Option[String],   //     name    = [this value if set],
  surname: Option[String],   //     surname = [this value if set],
  from:    Option[String],   //     from    = [this value if set],
  to:      Option[String],   //     to      = [this value if set],
  date:    Option[LocalDate] //     date    = [this value if set]
)

object TicketRepository {
   // ...
   def update(entityUpdate: TicketUpdate): Update0 = ???
   // ...
}

And you might want to delete entities using the fetch criteria:

object TicketRepository {
   // ...
   def delete(entityDelete: TicketFilter): Update0 = ???
   // ...
}

In all these cases, having some object that defines how the query will be performed might be really useful. We could get a lot of queries almost for free.

All these case classes can easily get out of sync, so it would be good if something could enforce that adding a field in one place requires adding it in the others as well.

If we operated under that assumption, we would be able to derive queries automatically - as long as we had two more pieces of information: the table name and the column names:

final case class TicketColumns(
  id:      String,
  name:    String,
  surname: String,
  from:    String,
  to:      String,
  date:    String
)
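
To see why the table name plus such a columns object is enough, here is a sketch of assembling an INSERT from them by hand (a hypothetical helper for illustration, not this library's API):

import doobie._
import doobie.implicits._

// pairs each column name with a value fragment; date is omitted so the
// sketch doesn't depend on java.time Meta instances
def insertTicket(table: String, cols: TicketColumns, create: TicketCreate): Update0 =
  (Fragment.const(
    s"INSERT INTO $table (${cols.name}, ${cols.surname}, ${cols.from}, ${cols.to})"
  ) ++ fr"VALUES (${create.name}, ${create.surname}, ${create.from}, ${create.to})").update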

Idea of a refactor

If we take a closer look, we'll see that these case classes are virtually identical - if we were able to flip the type of some fields from A to Option[A] to turn the entity into an update, or all of them to String to turn it into a column configuration, we would reduce the amount of code and fix the issue of case class synchronization.

And we can do this! The idea is called higher-kinded data and looks like this:

import java.time.LocalDate
import java.util.UUID

type Id[A] = A // removes the F[_] wrapper
type UnitF[A] = Unit // makes fields not available at creation "disappear"
case class ColumnName[A](name: String) // preserves name of column and its type

// F is for normal fields, which should be available in some form
// throughout the whole lifecycle
// C is for those that should be empty during creation and available
// from then on
final case class TicketF[F[_], C[_]](
  id:      C[UUID],
  name:    F[String],
  surname: F[String],
  from:    F[String],
  to:      F[String],
  date:    F[LocalDate]
)

// C[_] fields are Units, the rest are of the type inside F[_]
type TicketCreate  = TicketF[Id, UnitF]
// all fields are of the type inside F[_]/C[_]
type Ticket        = TicketF[Id, Id]
// all fields are Options of the type inside F[_]/C[_]
type TicketUpdate  = TicketF[Option, Option]
// all fields are column names
type TicketColumns = TicketF[ColumnName, ColumnName]

Higher-kinded data is data with higher-kinded types as type parameters.

This library is about taking this higher-kinded data definition and generating basic CRUD queries for it.
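
To make the aliases concrete, here is what values of each shape look like (illustrative data only):

// all of these are instances of the single TicketF definition
val columns: TicketColumns = TicketF[ColumnName, ColumnName](
  id      = ColumnName("id"),
  name    = ColumnName("name"),
  surname = ColumnName("surname"),
  from    = ColumnName("from"),
  to      = ColumnName("to"),
  date    = ColumnName("date")
)

val create: TicketCreate = TicketF[Id, UnitF](
  id      = (), // UnitF[UUID] = Unit - the database will assign it
  name    = "John",
  surname = "Smith",
  from    = "London",
  to      = "Paris",
  date    = LocalDate.of(2019, 5, 1)
)

val update: TicketUpdate = TicketF[Option, Option](
  surname = Some("Kowalski"), // update just the surname
  id = None, name = None, from = None, to = None, date = None
)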

Implementation

During implementation some decisions had to be made:

  • to avoid writing EntityF[Id, Id], EntityF[Updatable, Updatable] and EntityF[Id, UnitF] manually, some type aliases were introduced:
    import io.scalaland.ocdquery._
    
    type TicketCreate = Repo.ForEntity[TicketF]#EntityCreate
    type Ticket       = Repo.ForEntity[TicketF]#Entity
    type TicketUpdate = Repo.ForEntity[TicketF]#EntityUpdate
  • if you want to extend the filtering DSL you can write your own extension methods like:
    implicit class MyNewFiltering(columnName: ColumnName[Int]) {
    
      def <(number: Int): Filter = () => columnName.fragment ++ fr"< $number"
    }
    
How to extend the DSL?


    Actually, repositories don't assume much about the types they work on. They only assume that RepoMeta instances follow their contracts, that is, that they return reasonable doobie.Fragment values for each input.

    new Repo(repoMeta) // Repo only uses functionality of RepoMeta

    RepoMeta, on the other hand, relies on values derived from types - the derivation of the type classes used by RepoMeta is where the library actually relies on the fact that we use higher-kinded data. BUT, if you write them yourself, you are free to do it for any combination of types that makes sense for you.

    // common operations for all kinds of Repos
    sealed trait RepoMeta[Create, Entity, Update, Names] {
      val table:       Fragment                 // table name, with or without alias
      val columnNames: ListSet[ColumnName[Any]] // required by SELECT
    
      // turns different types into ColumnName -> column value Map
      val fromCreate: Create => ListMap[ColumnName[Any], Fragment]
      val fromEntity: Entity => ListMap[ColumnName[Any], Fragment]
      val fromUpdate: Update => ListMap[ColumnName[Any], Fragment]
    
      lazy val * : Fragment = ... // * for SELECT
    }
    
    // Repo which isn't using aliases in FROM - for simple SELECT without JOINs
    sealed trait UnnamedRepoMeta[Create, Entity, Update, Names]
        extends RepoMeta[Create, Entity, Update, Names] {
    
      // main use case: for F=Id pass Names => ColumnName
      // to obtain e.g. column in ORDER BY
      def unnamedColForNames[F[_]: Functor](
        f: Names => F[ColumnName[Any]],
        prefix: Option[String] = None
      ): F[ColumnName[Any]]
      // used for filtering
      def unnamedColForFilter(f: Names => Filter,
                              prefix: Option[String] = None): Filter
    
      // provides alias in FROM
      def as(name: String): NamedRepoMeta[Create, Entity, Update, Names] = ...
    }
    
    // Repo which uses aliases - only aliased Fragments can be used in JOINed queries
    sealed trait NamedRepoMeta[Create, Entity, Update, Names]
        extends RepoMeta[Create, Entity, Update, Names] {
    
      // everything that goes into ON
      val joinedOn: Option[Fragment]
    
      // for ORDER BY
      def namedColForNames[F[_]: Functor](
        f: Names => F[ColumnName[Any]]
      ): F[ColumnName[Any]]
      // for filtering
      def namedColForFilter(f: Names => Filter): Filter
    
      // JOIN (without ON), prevents nested tuples
      // A join B => (A, B)
      // (A, B) join C => (A, B, C)
      // (A, B, C) join D => (A, B, C, D)
      // etc
      def join[C1, E1, S1, N1](
        meta:     NamedRepoMeta[C1, E1, S1, N1],
        joinType: JoinType = JoinType.Inner
      )(implicit
        cta: TupleAppender[Create, C1],
        eta: TupleAppender[Entity, E1],
        sta: TupleAppender[Update, S1],
        nta: TupleAppender[Names, N1]
      ): NamedRepoMeta[cta.Out, eta.Out, sta.Out, nta.Out] = ...
    
      // fills ON after JOIN
      def on(
        left:  Names => ColumnName[Any],
        right: Names => ColumnName[Any]
      ): NamedRepoMeta[Create, Entity, Update, Names] = ...
    }
    
    object RepoMeta {
    
      // combines derived type classes into RepoMeta
      def instance[Create, Entity, Update, Names](
        tableName: TableName,
        columns:   Names
      )(
        implicit cols: AllColumns[Names],
        forCreate:     ColumnNameFragmentList[Create, Names],
        forEntity:     ColumnNameFragmentList[Entity, Names],
        forUpdate:     ColumnNameFragmentList[Update, Names],
        updateColumns: UpdateColumns[Names]
      ): UnnamedRepoMeta[Create, Entity, Update, Names] = ...
    
      // other methods
    }

    By analysing these you should be able to understand how to define your own arbitrary Repo by defining AllColumns, ColumnNameFragmentList and UpdateColumns instances.

    type ArbitraryColumns // ColumnName[A] fields only - config and filters
    type ArbitraryCreate  // only fields present at creation
    type ArbitraryEntity  // all fields
    type ArbitraryUpdate  // Updatable[A] fields only - for defining updates
    implicit val allColumns: AllColumns[ArbitraryColumns] = ...
    implicit val forCreate: ColumnNameFragmentList[ArbitraryCreate,
                                                   ArbitraryColumns] = ...
    implicit val forEntity: ColumnNameFragmentList[ArbitraryEntity,
                                                   ArbitraryColumns] = ...
    implicit val forUpdate: ColumnNameFragmentList[ArbitraryUpdate,
                                                   ArbitraryColumns] = ...
    implicit val updateColumns: UpdateColumns[ArbitraryColumns] = ...
    val columnNames: ArbitraryColumns = ... // the actual column names to use
    new Repo(RepoMeta.instance("table".tableName, columnNames))

    Additionally, Filters are implemented as Single Abstract Methods () => doobie.Fragment, so you can easily write your own ColumnName[A] => Filter extension methods, like the one shown in the decisions list above (see package sql).


    Limitations