0 My name is Jeremy Evans, and I'm here to talk about Sequel, the database toolkit for ruby. 1 Sequel has only been around two years, and I don't know much about the first year. I was only using Sequel a month before becoming maintainer last March. I didn't read much of Sequel's source until after I became maintainer, and what I found distressed me. The basic design was good, but the implementation was messy. Many features were implemented by overriding methods, there was very little documentation, and all methods were public. The advanced filtering at the time required ParseTree, and the code for that was the ugliest. I deprecated some things I thought were obviously bad ideas, and worked on a replacement for the ParseTree filters. 2 I added expression filters in 2.0, which let you write advanced filters without requiring ParseTree or handwritten SQL. I removed ParseTree support in 2.2, which allowed me to support Ruby 1.9 and JRuby in 2.3. I added prepared statement and sharding support in 2.4, which was impressive as it came only two weeks after 2.3. Since then, there have been a lot of features added and a lot of bugs fixed. 3 Sequel is split into two parts, core and model. This is different than the DM core/more split, where more adds features to core. sequel_core is a ruby interface to an SQL database. It's great for generating reports or dealing with sets of rows instead of single rows. sequel_model is an object relational mapper built on top of core. model classes are built on top of core datasets, so you get the benefits of core when using model. At an instance level, model is similar to AR or DM in basic usage. model depends on core, but core does not depend on model, and you could probably use a core dataset to return AR objects if you wanted to. 4 Sequel currently supports 13 database adapters. Some databases are supported by multiple adapters, and some adapters support multiple databases, though the best supported adapters are the native ones. I test MySQL, PostgreSQL, and SQLite using the native, JDBC, and DataObjects adapters, all of which have good support. I know Sequel users who are using the ADO, Firebird, ODBC, and Oracle adapters successfully, and I haven't received bug reports about the others. 5 One reason Sequel supports many adapters is that the adapters are easy to write. Only 5 methods are required. Adapters range in size from 50 lines for OpenBase to 1200 for PostgreSQL. The PostgreSQL adapter supports 3 different native drivers, native prepared statements, and even methods to add database triggers and functions, which is the reason it is so large. 6 Adding 5 methods is enough return results, but not enough for full support. Full support requires the delete and update methods return the number of rows modified and the insert method return the autoincrementing primary key value. Getting insert fully supported usually takes the most time. If you are adding support for a new database, you have to deal with the SQL syntax differences. Sequel makes this as easy as possible, as it is designed for flexibility over performance. You generally just have to override a few short methods. Running the integration tests is the best way to test adapter support. 7 After requiring sequel, you should create a Database object via the connect method. The convention for an application using a single database is to store the database object in a constant named DB. The main use for this object is to create dataset objects for queries, but it handles transactions and some other things too. 8 For example, You can use it to change the default settings for datasets and add SQL loggers. The transaction method is the only way to use an SQL transaction. It ensures that all access to the database inside the block uses the same connection inside a transaction. 9 Each database object has a separate thread safe connection pool. Sequel was well designed to use the connection pool when I became maintainer. I've added a few features, but most of it is the same. It is designed for high concurrency, where Sequel uses the pool for the least amount of time necessary. Sequel doesn't checkout a connection until the final SQL string is ready to be sent to the database, and returns the connection as soon as it is finished iterating over the results. Sequel ensures that connections are not leaked, so you don't need to cleanup connections manually or use a reaper thread. There's also a single threaded pool that's faster if thread safety is not important. 10 The Sequel dataset is really its defining object, and by that I mean it is what separates Sequel from other ruby database libraries. It uses a functional style API where you build queries by chaining methods, and it doesn't send the query to the database until you ask it to. The thing to note here is the lack of any SQL strings. You can code complex database applications in Sequel without ever writing any SQL code, if you don't stray from the SQL standard. You can use Sequel and code all of your SQL by hand too, if you want. 11 To request records, you use each or all. Each yields the rows as they arrive, and all loads all rows first. Some adapters buffer all rows in memory first, but with others, each works on a million record dataset without loading everything in memory. You can set an arbitrary proc called the row proc to call with the row before yielding, which is how model instances are returned by datasets. Datasets do no caching of records, calling each twice sends two queries. 12 Inserting, updating, and deleting records should be intuitive. Be careful with delete and update, as they affect all rows in the receiver. Remember this: filter first, delete second. You should note the usage of symbols for database columns. Sequel generally treats ruby strings as SQL strings and ruby symbols as SQL columns. 13 Sequel supports simple filters using strings or hashes. Sequel uses hashes for equality and inclusion. See here the difference between the use of a string and symbol for ln. 14 Sequel allows you to write your SQL queries as ruby expressions, without using ParseTree, by adding methods that ruby does not define. These examples show that Sequel knows about boolean logic, leading to cleaner looking SQL. The subselect example shown here shows how datasets can be used with other datasets. Even these filters are just the tip of the iceberg, queries can get much more complex than this, and Sequel handles them just fine. Putting your objects directly in your queries feels natural to me, moreso than using strings with placeholders. The expression syntax is terse, but hopefully intuitive. Be aware that one issue with using the bitwise operators in place of the logical ones is that you often need to use extra parentheses. 15 If you do string manipulation in the database, Sequel can help. It supports concatenating strings and doing LIKE searches. It also supports full text searching on some databases. 16 I mentioned that Sequel uses symbols for columns, but it also uses them for tables and schemas. Because columns often need to be qualified or aliased, you can specify the qualifier or alias in the symbol itself. A double underscore separates a table from a column, and a triple underscore separates a column from an alias. You can use methods that do the same thing. You can also use this for schema qualified tables, which Sequel supports well. 17 Sequel makes joining tables easy and flexible. You need to provide the join type, table, and conditions. The typical foreign key to primary key join uses a hash with the key being the column symbol in the table being joined, and the value being the column symbol in the current table. You can use complex conditions with an expression filter. You can also use a using join or natural join. 18 Sequel has helper methods for common join types. If you are doing multiple joins in your query, often you will have a condition that refers to a previous table instead of the current table or the table being joined. If that is the case, you need to qualify that column yourself, as otherwise Sequel will qualify it incorrectly. 19 Unfortunately, joining tables creates problems when multiple tables have columns with the same names, which happens a lot. Since Sequel returns rows as hashes, later columns end up clobbering earlier columns. This is one of the reasons that has_and_belongs_to_many associations don't work on model join tables in AR. 20 You can select only certain columns and/or alias columns manually, but that's a pain. Wouldn't it be great for something else to do the work and give you something useable back? 21 That's what graphing does. Graphing aliases for you, and returns rows as a hash of hashes. The main hash has table symbol keys, and the subhashes have column symbol keys and column values. This is a lot easier than aliasing everything yourself. I think graphing is one of Sequel's unique features, and it makes dealing with table joins at a row based level much easier. While graphing is great, there are cases when joining and aliasing manually is better. 22 Onto Sequel models. Remember model classes are backed by dataset instances, and model instances represent single rows in the dataset. Sequel doesn't require that you use a plain table for a model, you can use a joined dataset if you want. Generally a plain table is best if you want to create, update, or delete rows. You can use an ordered dataset to return all rows in a certain order by default, or a filtered dataset to restrict the model to a subset of rows. 23 Sequel didn't have associations when I started using it, people generally just wrote their own instance methods. It's easy to do that in Sequel, but if you want caching, callbacks, reflection, eager loading, and methods to add/remove associated objects, it's a lot more work. Associations handle all that for you. They are created with method names that reflect the database relationship, rather than imply ownership. 24 Sequel does not use proxies for associations. The many to one association method returns either the associated object or nil, and the to many association method always returns an array. Most uses of a proxy can be handled using the association dataset method, which returns a dataset that you can filter, reorder, or otherwise modify. The many to one association adds a setter method, and the to many associations add add association, remove association and remove all association methods. 25 I chose to not use proxies because they are more difficult to implement, understand, and reason about. With the association dataset method, there really isn't a need for proxies. The add, remove, and remove all instance methods are simple and descriptive. Note that the add and remove methods affect only the association, the remove method doesn't delete the passed object from the database. 26 Some people complain that association_dataset is an ugly method name. Maybe so. However, accessing the dataset should be less frequent than accessing the associated objects. If you find yourself using the dataset a lot with different filters, you should add multiple associations with those filters built in. Doing so leads to more descriptive code. I added the clone option to make this easier. 27 Clone is only one of around 30 current association options. Most users will only use a small number of the options, but the numerous options exist to give users near complete control over all aspects of the association. All association defining methods also take a block that yields a dataset that you can modify if there isn't an option available, which you can use to make an association a union of multiple datasets, for example. 28 Many options affect eager loading, which I'll discuss soon. Sequel provides the same association callbacks as AR, and also adds an after_load callback, which I use in an accounting app to figure out if each associated entry is a debit or a credit to the current account. You can use the :extend option to extend the association dataset with a module. 29 Since Sequel is a toolkit, flexibility is important, so the added methods are easy to override. You just override them and call super, no aliasing required. The modification methods are split into public and private methods. The private method starts with an underscore and does the query. The public method handles caching and callbacks. If you want the setter method to accept a hash in addition to a model object, you override the public method and call super. If you want to set an additional column in the database, you override the private method. 30 Sequel's eager loading gives you choice whether to load associations in a separate query with eager or use a join with eager_graph. Using eager is recommended. If your association uses a different database, you must use eager, if your order or filters use columns in the associated table, you must use eager_graph. Both have the same API and can be used simultaneously. I chose to use two methods because they do very different things, and to leave the choice to the user. Also, Sequel never attempts to parse SQL, so it wouldn't be able to guess which method was appropriate. 31 Sequel only supports three association types natively, but you can support any other association type using the :dataset option, which takes a proc that is instance evaled. The has many through has many association just uses a filtered eager graphed dataset. The has many through belongs to isn't shown here because it is handled by the built in many to many association. You can also create much more complex associations, such as joining on any of multiple keys through a third table. That is not a hypothetical example made purposely difficult, I use it in one of my apps. AR can actually handle that associations using custom SQL. However, 32 it can't eager load it. Sequel supports eager loading of custom associations using the eager loader option. This option takes a proc that accepts a key hash, an array of current objects, and the dependent associations to eagerly load. The key hash is an optimization. It's a hash of hashes, with keys being columns. The subhashes have column value keys with values being an array of related objects. Let me break down how this works. First, you get the subhash for the primary key and set the cached invoices for all given firms to the empty array. You then get all invoices for all clients of all the firms being eagerly loaded, using the keys of the subhash. For each invoice, you get the value in the subhash for the invoice's client's firm_id foreign key, and you add the invoice to the firm. This feature is powerful enough to load any association that can be eagerly loaded. An eager_grapher option is also available to allow you to set custom joins when eager graphing. 33 Sequel doesn't support polymorphic associations because I think they are a bad idea. Their only purpose is to reduce the number of tables, which is a solution to a non-problem. An association between two separate classes should have its own join table, for the same reason you use separate tables for classes even if they have the same schema. Polymorphic associations are more complex and they break referential integrity. If you must use them, there is a Sequel plugin that handles them. 34 That plugin just uses the options and techniques I've discussed, no monkey patching. Sequel's eager loader is even powerful enough to load all ancestors and decendants in a tree structure. It's my belief that Sequel has the most powerful and flexible associations of any ruby ORM. 35 Sequel currently supports validations similar to AR. My philosophy is that they should only be used for nice error messages, not data integrity. Use database constraints if you can. It supports built in validations 36 as well as custom ones you write yourself. 37 Sequel supports most of the same hooks as AR, and the usage is currently similar. I prefer database triggers to hooks for data integrity. However, if you are targeting multiple databases, using hooks and validations is probably easier. 38 Sequel's built in pagination support is just a wrapper for limit that adds a few methods. It's helpful if you are writing a search engine. each page yields a paginated version of the current dataset a page at a time. If you use an adapter that buffers all records before yielding, it can be useful to process a record set that won't fit in memory. 39 Sequel's built in caching supports caching model objects to any object with the ruby-memcache API. 40 You can use Sequel to create and alter tables. create table takes a table name and a block. You have to explicitly use primary key if you want one. You can use the column method or method_missing to create columns. You can use the index method to create indexes. If you use a ruby class as a type, Sequel uses the most appropriate database type for that class. If you use a symbol, Sequel uses it directly. Sequel encourages the use of database constraints and allows you to create them with the constraint method. Sequel encourages the use of real foreign key references, so foreign key takes a table argument. You can also use composite keys if you want. 41 alter table is similar. You can add and drop columns, constraints, and indexes. Adding foreign and primary keys is easy, and you can use composite keys here too. You can also change a column's name, type, or default. 42 Most table altering methods can be called directly on the database object. You can also rename and drop tables and create and drop views using database methods. 43 Generally, the schema methods are called inside migrations. A migration proxies most methods to the database that applies it. 44 There is a individual migration API, but it's recommended to use Sequel's migrator, which deals with a directory of individual migration files. Sequel's command line tool uses the migrator with the -m switch, but you can also use the migrator API. 45 My migration philosophy is that you should usually only make schema changes in a migration, no data modification unless necessary. More importantly, keep your migrations decoupled from your models and environment, otherwise it's likely that you'll make changes that break old migrations and you won't find out till later. The migration files must use integer version numbers, I think timestamped migrations cause more problems than they solve. 46 Sequel currently supports a DM like way of specifying schemas in your models. I think this is a bad idea as it makes migrations difficult. It also goes against my philosophy. I think a model is a nice front end to the database, not that the database is just a place to store your model's data. You can use Sequel just for persistence, but that's not a design goal. Most databases I use were designed before I started using ruby. In my experience, most databases are integration databases, not application databases. If your data is important, design your app around your database, not vice-versa. 47 Let's move on to some advanced core features. Sequel supports bound variables and prepared statements, and it does so in a database independent manner. Database support for placeholders varies greatly, but with Sequel the API for all databases is the same. Sequel supports the same interface even if the database or adapter doesn't have native support. To use a bound variable, you specify the placeholder as a symbol starting with a dollar sign. When you call the statement, you provide a hash of symbols for your placeholders, without the dollar signs. You have to tell Sequel which type of statement you want to use, and for update and insert statements, you can also provide a hash of attribute values, which can also include placeholders. If there is native bound variable support, this may be faster, though you should profile and/or benchmark. The performance difference probably won't be significant unless the bound variables are expensive to literalize. 48 Prepared statement support is similar. Instead of call you use the prepare method on the dataset and give the statement a name. You can then call the statement through the returned object or through the database via the name. Prepared statements are most beneficial when the database spends a lot of time planning the same query. If you prepare it once, the database only plans it once. It also has the same literalization benefits as bound variables. You can use both of these on model datasets and get back model objects. The code for this is probably the most complex code in Sequel, and I don't recommend using it without profiling and benchmarking. If used incorrectly, it might make your app slower. 49 Sequel supports database stored procedures in the MySQL and JDBC adapters, similar to the prepared statement support. The API is a little strange, as you call the database stored procedure on a dataset object and specify a query type. It's done this way so that you can have a stored procedure that returns model objects, as well as one that one that doesn't return rows. 50 Sequel has built in support for master/slave database configurations. Moving from a single database to a master/slave configuration can be done just by adding a servers option to the connect method with a read_only key. The read_only value should be a hash or proc that returns a hash. It is merged with the other options when a connection is requested. Sequel will then use the slave databases for selects and the master database for all other queries. You can also use multiple masters or single slaves. 51 The master/slave support is mostly a subset of the generic sharding support. Sequel makes it easy to deal with a partitioned database, you can choose which server to use for any query via the server method. The connection pool handles sharding, so it is supported with all adapters. When using sharding, the tables you are using should have the same schema on each server, otherwise you should use separate Database objects. 52 Sequel doesn't override methods added by ruby, but it does add methods to the core classes. Most of these implement Sequel's DSL, though currently some exist just to ease implementation. I'm going to review the ones related to the DSL. Hashes and arrays of two pairs are used are used to specify conditions. By default, the condition entries are ANDed together. However, you can use sql_or to OR the entries. You can use sql_negate to AND the negation of the entries. You can use the bitwise negation operator to invert the conditions. The case method returns an SQL case statement, useful for conditional sums and many other things. 53 Since Sequel usually treats arrays of all two pairs as conditions, you have to call the sql_array method if you want it treated as an array. The hash and symbol bitwise and and or methods are treated like a logical AND or OR. The as method is used to set up aliases. There are a few methods relating to casting values. The cast methods can take Ruby classes, similar to how database types are handled in the schema methods. The cast numeric method also tells Sequel it is a number, so it knows to treat the bitwise operators as bitwise operators in SQL. The cast string method also tells Sequel that it is a string, so addition will do concatenation. 54 Since ruby strings are usually treated as SQL strings, if you want the string to be used literally without quoting, you call the lit method. You can use placeholders in the string and call lit with arguments too. Symbol has the numeric operators defined and they operate as expected in SQL. The sql_function method creates SQL functions. If you are using ruby 1.8, you can use the array access operator for SQL functions, and you can also use the inequality operators directly on symbol. 55 You use the identifier method to handle a symbol as a single identifier, so it won't care about double or triple underscores. You use the qualify method directly qualify a column with a table. The like and ilike methods handle searching, and you can use regular expressions on PostgreSQL or MySQL. The sql_number or sql_string methods work similar to cast_numeric and cast_string, but without the casting. Sequel knows a little about how database types work. If it knows an object is a string in SQL, you won't be able to use the division operator, and the addition operator will do concatenation. Most methods added to symbol also work on the objects that Sequel creates, which is what allows you to write complex SQL queries directly in ruby. 56 Sequel ships with a command line tool called sequel. You it with either a connection string or a path to a yaml file as an argument. It gives you an irb shell with the database object already loaded in the DB constant. You can also use it to run migrations. It's great for quick access and playing with sequel. With the -L option, you can load all files in your models. It starts quickly so you can too. 57 Sequel isn't designed around the needs to web applications, but it does work well with them. Sequel's settings are fairly strict by default, most people will want to turn off raise on typecast failure, otherwise user input errors will raise exceptions before validations are called. You can turn off typecasting completely, but that is not a good solution. Using the validates not string validation, you can check that non-string columns don't have string values. If there is a string value in a non-string column and you have typecasting enabled and raise on typecast failure is false, then it is obvious that typecasting failed, so it will add a validation error. Turning off raise on save failure is necessary if you don't want to use begin and rescue to catch save failures. Turning off strict param setting will cause Sequel not to report errors if there is an attempt to access a restricted setter method in a mass assignment call. It makes debugging more difficult, so I urge caution. All of these methods can be set at a per model or per instance level. Sequel offers multiple mass assignment methods. set_only is recommended, as it only allows access the attributes you specify. This is a better approach than setting the allowed columns at a model level, as you can decide which attributes are allowed in each call. Some forms may let you set some attributes and not others, and you should make sure that no form submission can update more attributes than you specifically allow. For consistency, there is a set_restricted method, that works as you might expect, but I don't recommend its use. You can use Sequel just like AR and set allowed or restricted columns at the model level, and just use set to update the attributes. For very simple applications, that may be easiest. 58 Sequel is lightweight compared to AR or DM. sequel core 63% smaller than dm-core, and all of sequel is still 33% smaller. All of sequel is 3.5 times smaller than dm-more and 2.5 times smaller than AR. Sequel loads as fast as dm-core, and 2.5 times faster than AR or dm-more. Sequel is comparable to AR in performance from the benchmarks I've seen. Performance optimization has not been a focus during my maintenance, I've mainly worked on new features, bug fixes, and cleaning the internals. 59 Sequel is currently at version 2.11, and only a year ago it was at 1.3. There was a release every 2 to 3 weeks for a while, now there is a monthly release cycle. I take bug reports seriously. Bugs posted on the tracker generally get fixed in a day to a week. I actually try to fix bugs that don't affect me personally. You don't have to provide specs if the bug is obvious. Sequel's bug tracker is usually empty. I run the full release test suite before pushing to github, so the master branch is stable enough to use in production if you use one of the databases I test. I run all of my personal apps directly off the master branch. You should read the commit messages before pulling, they are descriptive. 60 It is very easy to contribute to Sequel. Zero bureaucracy, no plus 1s, just show me the code and tell me why it is a good idea. I accept patches via IRC, the Google Group, the bug tracker, or github. I don't accept all patches, and modify many I do accept, but you will get a response quickly. I prefer specs with new features, but I'll write the specs myself if I think a feature is a good idea. I'll even rewrite the implementation if I like the feature but not the patch. 61 The future is coming, and it is Sequel 3. As much as possible it will be compatible with Sequel 2. Like the move from 1.5 to 2.0, I will be deprecating stuff in 2.12 and removing it in 3.0. Many built-in features will be moving into plugins or extensions that still ship with sequel but will need to be required separately. The majority of this work is already completed and available in the master branch. Sequel 3 will be more lightweight, easier to use, and cleaner internally than Sequel 2. 62 That concludes my presentation. I'll open it to the floor for questions now. Thank you all for your time.