Posts tagged databases

Jul19

Memcached, a Database?

memcached databases nosql | comments

In my QCon talk Horizontal Scalability via Transient, Shardable, Share-Nothing Resources, I argued that memcached is the father of modern shardable resources. Today’s NoSQL key-value stores all owe some part of their inspiration to memcached. Even feature-rich datastores such as CouchDB or Cassandra borrow a cornerstone idea from memcached: throw away some features historically associated with databases in order to make big gains in scalability and resiliency.

Continue reading »

Mar15

Graph Databases

databases nosql | comments

Graph databases are a type of datastore that treats the relationships between things as being as important as the things themselves. Examples of datasets that are natural fits for graph databases:

Continue reading »

Dec27

Crystalized Innovation

databases | comments

Tim Lind writes about Innovation in Database Technology:

Continue reading »

Dec16

No Knobs

ops databases | comments

Current systems were built in an era where resources were incredibly expensive, and every computing system was watched over by a collection of wizards in white lab coats, responsible for the care, feeding, tuning and optimization of the system. In that era, computers were expensive and people were cheap. Today we have the reverse. Personnel costs are the dominant expense in an IT shop.

Continue reading »

Jul21

A New Tool for a New Problem

databases | comments

There is a misperception that if someone advocates a non-relational database, they either don’t understand SQL optimization, or they are generally a hater. This is not the case.

It is reasonable to seek a new tool for a new problem, and database problems have changed with the rise of web-scale distributed systems. This does not mean that SQL as a general-purpose runtime and reporting tool is going away. However, at web-scale, it is more flexible to separate the concerns. Runtime object lookups can be handled by a low-latency, strict, self-managed system like Cassandra. Asynchronous analytics and reporting can be handled by a high-latency, flexible, un-managed system like Hadoop. And in neither case does SQL lend itself to sharding.

Continue reading »

Jul18

Versioned Concurrency Instead of Locks

databases concurrency | comments

Locking access to the data store is one way to provide safe concurrent access, but it starts to fall over as soon as you have more than a few scripts accessing the same Storage object. We have no locks in CouchDB, which we accomplish by using multi-version concurrency control (MVCC), a storage technique popularized by the PostgreSQL database. The MVCC approach would seem to be a simpler and more humane solution to the problem of concurrent storage usage.
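
As a rough illustration of the idea, here is a minimal single-process sketch in Python. It is not CouchDB's or PostgreSQL's implementation; the class and method names (VersionedStore, ConflictError, base_rev) are hypothetical. Each write appends a new revision rather than taking a lock, readers always see the latest committed revision, and a writer working from a stale revision is rejected.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale revision."""


class VersionedStore:
    """Toy in-memory store (hypothetical): writes add revisions, nothing is locked."""

    def __init__(self):
        self._docs = {}  # key -> list of (revision, value), newest last

    def get(self, key):
        """Return (revision, value) for the newest revision of key."""
        return self._docs[key][-1]

    def put(self, key, value, base_rev=None):
        """Append a new revision, but only if base_rev is the current revision."""
        history = self._docs.get(key, [])
        current_rev = history[-1][0] if history else None
        if base_rev != current_rev:
            raise ConflictError(f"{key}: current rev is {current_rev}, got {base_rev}")
        new_rev = (current_rev or 0) + 1
        # Old revisions are kept; a real system would compact them later.
        self._docs[key] = history + [(new_rev, value)]
        return new_rev


store = VersionedStore()
rev1 = store.put("post", "draft")                # first write, no base revision
rev_a, _ = store.get("post")                     # writer A reads revision 1
rev_b, _ = store.get("post")                     # writer B reads revision 1
store.put("post", "edit by A", base_rev=rev_a)   # A commits revision 2
try:
    store.put("post", "edit by B", base_rev=rev_b)  # B's base revision is stale
    conflict = False
except ConflictError:
    conflict = True                              # B must re-read and retry
```

The losing writer is never blocked; it simply learns that its copy is out of date and can re-read and retry. A real datastore would also make the revision check and the append atomic, which this sketch does not attempt.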

Continue reading »

Jul13

Scanty on Redis

scanty databases | comments

I’ve ported Scanty to Redis. If you don’t know, Scanty is the code for this blog, and Redis is a high-speed persistent key/value store with very useful extra features (like atomic list operations). Think Memcached but persistent, or Tokyo Tyrant but with more features and minus the ability to store databases larger than memory.

Continue reading »

Jul08

SQL Databases Are An Overapplied Solution (And What To Use Instead)

databases | comments

Relational databases are a great solution for storing a certain type of data that you’ll need to access in a certain way:

Continue reading »

Jul06

SQL Databases Don't Scale

databases | comments

A question I’m often asked about Heroku is: “How do you scale the SQL database?” There are a lot of things I can say about using caching, sharding, and other techniques to take load off the database. But the actual answer is: we don’t. SQL databases are fundamentally non-scalable, and there is no magical pixie dust that we, or anyone, can sprinkle on them to suddenly make them scale.

Continue reading »

Jun29

Big Trends

databases cloud queueing | comments

The following are what I believe to be the three most important areas of radical (vs. evolutionary) innovation in web application server components over the coming 6 - 18 months:

Continue reading »

Jun21

Retwis, an Example App Without a SQL Database

databases | comments

I’m fairly convinced that the relational database paradigm - which in practice means SQL databases - is on its last legs. There is a growing array of excellent options for document or key-value datastores which sidestep all the scaling issues and complexity of SQL databases: CouchDB, Redis, Tokyo Tyrant, Memcachedb, and MongoDB are some of the front-runners.

Continue reading »

Mar30

Export to CSV

databases sequel heroku | comments

A quick script to export a database table to CSV is below. Drop into lib/tasks/export.rake (Rails), or directly into your Rakefile, then call at the command line:

Continue reading »

Mar02

Database Versioning

databases activerecord sequel datamapper | comments

Migrations bother me.

On one hand, migrations are the best solution we have for the problem of versioning databases. The scope of that problem includes merging schema changes from different developers, applying schema changes to production data, and creating a DRY representation of the schema.

Continue reading »

Feb28

ActiveRecord Migrations Outside Rails

activerecord databases sinatra | comments

To use ActiveRecord’s migrations with Sinatra (or another non-Rails project), add the following to your Rakefile:

Continue reading »

Feb11

Taps for Easy Database Transfers

taps databases opensource | comments

Migrating databases from one server to another is a pain: mysqldump on old server -> gzip -> scp big dump file -> gunzip -> mysql. It takes a long time, is very manual (and thus error-prone), and generally has the stink of “lame” hanging about it.

Continue reading »

Feb03

Drizzle, Simplified MySQL Fork

databases | comments

Drizzle is a fork of MySQL 6.0. From the FAQ:

Continue reading »

Dec29

DBM Files and Tokyo Cabinet

databases | comments

I’ve always liked DBM files as a lightweight, fast, persistent datastore. For those not familiar with the format, imagine something that has an interface like memcached (key/value storage and lookup, aka a giant hashtable) but stores to a single file on disk à la SQLite. I used GDBM in my C and Perl days; these days, Ruby makes it even easier:

Continue reading »

Sep14

Database URLs

activerecord databases datamapper sequel | comments

DataMapper and Sequel both use URLs for configuring database connections:

Continue reading »

Sep03

DDL Transactions

rails databases | comments

Jeremy Kemper merged my patch to wrap migrations in a transaction into Rails edge last week. (This was a two year old patch - props to Tarmo Tänav for some mad rebasing.)

Continue reading »

Jan12

ThruDB

databases | comments

I just love the incredible speed with which new technology paradigms appear and gain traction in these happy and hale days of bubbledom. Less than a month after I discovered document-oriented databases, there’s a new one in the mix: ThruDB. Don’t look at the Ruby sample code, it’ll hurt your eyes - the creators of ThruDB aren’t Rubyists. Instead, I recommend Sebastian Delmont's writeup of how it could be used with Rails. Rather than write an ActiveRecord adapter, he suggests an ActiveDocument which uses a DataMapper type pattern for the schema, and has similar methods and callback hooks for most operations. Finds would use ThruDB’s native query language, which is the Lucene syntax.

Continue reading »

Dec17

A World Without SQL

databases | comments

Amazon’s SimpleDB is bringing non-relational databases (such as CouchDB and RDDB) into the spotlight. I’ve been eagerly looking forward to the paradigm in databases for some time now, having never been convinced that object databases were the next step. So this is exciting stuff.

Continue reading »