One of the most important choices we will make together is how we store your data. For some businesses, a simple excel spreadsheet might be enough, but most will need something a bit more robust. Fortunately, we have a wide variety of database options to choose from.
I’ll just say it … I like working with a local database. I’ve been working in local databases for ages and it’s where I’m the most comfortable. I like having complete control over everything and there are a lot of local databases to choose from. Let’s take a look at a few of the most common options.
Microsoft SQL Server
This is my favorite database. It’s the one I spend the most time in. Unfortunately, it’s expensive. If you have a lot of data, the cost is justified. If you have less data or you just want to get a proof of concept project out there to see if it’s a good fit you might try Microsoft SQL Server Express. SQL Server Express is free but it has some limits on the number of records it can store and the system resources it will use.
PostgreSQL is a great option if you want to save some money. PostgreSQL is a powerful database engine with all the bells and whistles you could want (and then some). This is a viable alternative to SQL Server for any small/mid-size business.
Like PostgreSQL, MySQL is open source … at least mostly open source. In newer versions, there are some proprietary modules included in MySQL. I rarely use MySQL anymore. This IS a powerful and (mostly) free database capable of handling big data, but I find working in PostgreSQL easier.
Microsoft Access doesn’t get a lot of love anymore. I cut my teeth on this database in the beginning of my career so I’ll always have a place for it in my heart. I think Access is a perfectly acceptable choice for a proof of concept project but … and this is important … you should not start a proof of concept project with Microsoft Access without budgeting time and expenses to migrate away from it and into another RDMS (relational database management system) as soon as possible.
I won’t go into a lot of detail on cloud-based options here, as the details tend to change rapidly in this category. I will say that you will find almost all of the databases mentioned above (SQL Server, PostgreSQL, MySQL) in the cloud plus a few new choices worth considering. It might surprise you, but my favorite cloud-based database isn’t SQL Server, and I LOVE SQL Server. No, if I’m going to do a project in the cloud my platform of choice today is Snowflake. Snowflake is a remarkable, cost-effective, web based platform that offers nearly infinite expansion with just a few clicks without sacrificing the elegant simplicity of the dimensional data warehouse (more on this in the section below).
Schema is tech talk for model or layout, so having picked a database, our next step is to decide HOW we will store our data. There are two common methods for storing data in an analytics database. We could create an operational data store or a data warehouse.
Operational Data Store
The operational data store (ODS) looks very much like the source system it is derived from. The ODS is refreshed frequently, and it does not save a history of changes. The ODS can be convenient for simple reporting, but more complex reports that bring data together from different business groups or business processes can be difficult. I often use an existing ODS database as staging for the development of a new data warehouse.
Just the words data warehouse can frighten an experienced IT pro. I’ve heard statistics that put the failure rate of data warehouse projects north of 70%. That is alarming, to say the least. Fortunately, many risks associated with data warehouse projects can be mitigated with some careful planning and a pragmatic approach to design. My method is to start small, build on success. Build just what we need, when we need it and add to it as we go so that each part we build can support parts we plan to build in the future. When done right, a data warehouse project will start small and deliver continuous value. A data warehouse doesn’t have to be scary, it doesn’t have to take years to build and it can deliver value that simply can’t be realized any other way.