Understanding the Data Storage Options in HTML5

By: Brian J. Stewart

Historically, web applications had very limited local storage capabilities. The only way to persist data from a web application on the client side was through cookies. However, cookies hold only a small amount of data (about 4 KB each). In addition, cookies are sent to the server with every request, which makes them inefficient for web applications. Most importantly, cookies carry many security implications. HTML5 and related specifications provide more scalable and secure options for storing data locally:

• Web Storage – simple key/value storage
• Indexed Database API (IndexedDB) – robust hierarchical key/value storage that provides basic indexing and querying capabilities
• Web SQL Database – client-side database with SQL language support

    Because not all data is the same, and because HTML5 and its related specifications offer several options for storing data on the client, it is critical to match the right technology to the problem at hand. A technology is only as good as its fit to the specific problem it is applied to. Several key criteria should be considered, including the volume/size of the data, the permanency of the data, the structure and type of the data, and how the data will be accessed. Based on these criteria, there are several use cases where Web Storage makes sense and several where IndexedDB makes more sense.

    Overview of Data Storage Options

    First, let’s start with a high-level overview of the local data storage mechanisms, as well as the key differences between the data storage mechanisms: Web Storage, Indexed Database API, and Web SQL Database.

     Web Storage

    Web Storage provides a simple key/value storage mechanism. The only supported datatype for a value is a string, although more complex data can be serialized to a string and stored in Web Storage. For example, complex objects such as JavaScript objects or arrays can be converted to a string representation using JavaScript Object Notation (JSON).
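
    The JSON round trip described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the StringStore interface and the saveJson and loadJson helpers are hypothetical names introduced here, and StringStore is assumed to be a structural subset of the browser's Storage interface so that localStorage or sessionStorage can be passed in.

```typescript
// Structural subset of the browser's Storage interface (an assumption of
// this sketch), so localStorage or sessionStorage can be passed in.
interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Serialize a JSON-compatible value and store the resulting string.
function saveJson(store: StringStore, key: string, value: unknown): void {
  store.setItem(key, JSON.stringify(value));
}

// Read the string back and parse it. Returns `fallback` when the key is
// missing or the stored text is not valid JSON.
function loadJson<T>(store: StringStore, key: string, fallback: T): T {
  const raw = store.getItem(key);
  if (raw === null) {
    return fallback;
  }
  try {
    return JSON.parse(raw) as T;
  } catch (_error) {
    return fallback;
  }
}
```

    In a browser, a call such as saveJson(localStorage, "prefs", { theme: "dark" }) would store the object as a string, and loadJson(localStorage, "prefs", { theme: "light" }) would return it as an object again.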

    Web Storage provides two storage mechanisms: Session Storage (sessionStorage) and Local Storage (localStorage). Session Storage is accessible from any page of a website (more precisely, any page of the same origin) within the same browser tab or window, but is only persisted for the current user session. The second storage mechanism, Local Storage, is also accessible from any page of the same origin but is persisted beyond the current user session. Because Local Storage persists, its data is available each time a user visits the website.
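
    As a rough sketch of the difference in lifetime, the helper below routes each value to one of the two stores. The names BrowserStores, Lifetime, remember, and recall are hypothetical, and the stores are passed in as parameters (in a browser they would be sessionStorage and localStorage) so the logic does not depend on browser globals.

```typescript
// Structural subset of the browser's Storage interface (assumed here).
interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface BrowserStores {
  session: StringStore;    // in a browser: sessionStorage
  persistent: StringStore; // in a browser: localStorage
}

type Lifetime = "session" | "persistent";

// Write the value to the store that matches the required lifetime.
function remember(
  stores: BrowserStores,
  lifetime: Lifetime,
  key: string,
  value: string
): void {
  const target = lifetime === "session" ? stores.session : stores.persistent;
  target.setItem(key, value);
}

// Prefer the session value when both stores hold the key.
function recall(stores: BrowserStores, key: string): string | null {
  const sessionValue = stores.session.getItem(key);
  return sessionValue !== null ? sessionValue : stores.persistent.getItem(key);
}
```

    With this arrangement, a temporary sort order would be remembered with the "session" lifetime and disappear when the session ends, while a theme preference remembered with the "persistent" lifetime would still be there on the next visit.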

    Web Storage only provides synchronous access to data. This means that every read and write blocks, so larger data values will adversely affect application responsiveness and user experience.

     Indexed Database API (IndexedDB)

    IndexedDB provides a robust key/value storage mechanism with support for simple values or hierarchical objects. Also, unlike Web Storage, IndexedDB supports more advanced features such as transactions, indexing, and querying.  The fundamental storage unit for IndexedDB is an Object Store (objectstore). An objectstore can store any number of objects, limited only by the specific browser implementations of the IndexedDB specification.

    IndexedDB supports transactions in three modes: readonly, readwrite, and versionchange. The transaction mode affects concurrency. As its name implies, readonly provides read access to an objectstore, and IndexedDB allows multiple concurrent readonly transactions. The readwrite transaction mode allows records to be read, created, updated, and deleted; however, only a single readwrite transaction can be active on a given objectstore at a time. The last transaction mode, versionchange, allows object stores and indexes to be created and deleted.
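
    A write inside a readwrite transaction might look like the sketch below. It assumes an already-open database (db) containing an object store named "employees" keyed by an id field; the store name, the Employee shape, and the saveEmployee helper are all hypothetical and used only for illustration.

```typescript
// Hypothetical record shape used only for illustration.
interface Employee {
  id: number;
  name: string;
  department: string;
  state: string;
}

function saveEmployee(
  db: IDBDatabase,
  employee: Employee,
  onDone: () => void,
  onError: (error: unknown) => void
): void {
  // A readwrite transaction scoped to a single object store.
  const transaction = db.transaction("employees", "readwrite");

  // put() inserts the record, or replaces an existing one with the same key.
  transaction.objectStore("employees").put(employee);

  // The write is committed when the transaction's complete event fires.
  transaction.oncomplete = () => onDone();
  // The abort event fires when the transaction fails or is aborted.
  transaction.onabort = () => onError(transaction.error);
}
```

    A read-only lookup would request the "readonly" mode instead, which lets it run alongside other readonly transactions.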

    An objectstore can have one or more indexes. Indexes are stored in specialized object stores and are automatically updated when a record is created/updated/deleted in the referenced (indexed) objectstore.
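
    As an illustration, the sketch below opens a database and creates an objectstore with two indexes. Schema changes like these run inside the versionchange transaction that IndexedDB starts when a database is created or its version number increases. The database name "hr-demo", the store name "employees", the index names, and the openEmployeeDb helper are assumptions made for this example; in a browser, the factory argument would be the global indexedDB object.

```typescript
function openEmployeeDb(
  factory: IDBFactory,
  onReady: (db: IDBDatabase) => void,
  onError: (error: unknown) => void
): void {
  const request = factory.open("hr-demo", 1);

  // Runs inside a versionchange transaction, the only place where object
  // stores and indexes can be created or deleted.
  request.onupgradeneeded = () => {
    const db = request.result;
    // Records are keyed by their "id" property.
    const store = db.createObjectStore("employees", { keyPath: "id" });
    // Each index is kept up to date automatically as records change.
    store.createIndex("byDepartment", "department", { unique: false });
    store.createIndex("byState", "state", { unique: false });
  };

  request.onsuccess = () => onReady(request.result);
  request.onerror = () => onError(request.error);
}
```

    Adding another index later would mean opening the database with a higher version number and creating the index in the same upgrade handler.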

    IndexedDB also supports querying data asynchronously through cursors. Similar to traditional database cursors, a cursor can be used to retrieve specific records based on the key or any indexed field. When opening a cursor (query), a single value or a range of values can be supplied to retrieve one or more records. For example, it is possible to retrieve all employees in specific states (regions) or departments.
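
    The employee example could be sketched as follows. This assumes an open database with an "employees" objectstore that has an index named "byDepartment" on the department field; those names, the Employee shape, and the findEmployeesByDepartment helper are hypothetical.

```typescript
// Hypothetical record shape used only for illustration.
interface Employee {
  id: number;
  name: string;
  department: string;
  state: string;
}

function findEmployeesByDepartment(
  db: IDBDatabase,
  department: string,
  onDone: (employees: Employee[]) => void,
  onError: (error: unknown) => void
): void {
  const transaction = db.transaction("employees", "readonly");
  const index = transaction.objectStore("employees").index("byDepartment");
  const matches: Employee[] = [];

  // Passing a key limits the cursor to records whose indexed field equals it.
  const request = index.openCursor(department);
  request.onsuccess = () => {
    const cursor = request.result;
    if (cursor) {
      matches.push(cursor.value as Employee);
      cursor.continue(); // fires onsuccess again with the next match
    } else {
      onDone(matches); // a null cursor means there are no more records
    }
  };
  request.onerror = () => onError(request.error);
}
```

    The results arrive through the callback rather than as a return value, because the cursor advances asynchronously, one record per success event.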

     Web SQL Database

    The Web Applications Working Group, which is responsible for creating and maintaining specifications related to web applications, permanently put the Web SQL Database specification on hold because all web browsers that implemented the proposed specification used the same underlying implementation (SQLite). The W3C requires multiple independent implementations before a specification can be standardized.

    Of the major web browsers, only Google Chrome and Apple Safari support the Draft Web SQL Database Specification. Microsoft Internet Explorer and Mozilla Firefox do not support Web SQL Database. Therefore, due to limited cross-browser support and the on-hold status of the W3C specification, it is not recommended to use Web SQL going forward.

    Key Criteria for Selecting Data Storage Mechanism

    Choosing the right data storage mechanism is dependent on several key criteria. The important solution architecture decision of whether to use Web Storage or IndexedDB should only be made after careful consideration of the current and potential future needs for the website or business web application. The following are the most important criteria and related questions to ask:

    Data volume/size

    • Is the data a single value or a collection of records?

    • How large is each data item or collection?

    Permanency of data

    • Should the data only be persisted for the current session or should it be preserved for future sessions as well?

    • How often will the data be used? Is it for a single page or multiple pages?

    Structure and type of data

    • What is the data type for the data entity?

    • Is it a single value or a complex object (with multiple fields/attributes)?

    How data will be accessed

    • How will the data be retrieved?
    • Is it necessary to query for specific records?
    • Should the data be accessed synchronously (loading data is quick) or asynchronously (loading data is slower due to data size)?

    Common Use Cases for Data Storage

    Based on the above key criteria, from a solution architecture perspective there are several scenarios and use cases where Web Storage is the best option and several use cases where IndexedDB is the better option.

    When to use Web Storage

    The following are scenarios where Web Storage is the best architectural solution:

    1. Need ability to store temporary data – Stateless web applications are generally more scalable than web applications that must maintain state. Session Storage enables information that traditionally would have required server-side sessions to be stored locally. This includes information such as temporary preferences or selected options (e.g. sort or filter criteria) and the current user’s identity.

    2. Need ability to store limited data – Web Storage is typically limited to roughly 5-10 MB, depending on the browser, and only supports synchronous data retrieval, so Web Storage is only appropriate for storing/retrieving a small amount of data.

    3. Need ability to store a single record – Web Storage only supports string values and isn’t ideal for storing more than one value, as that would require converting an array to its string representation. Therefore, Web Storage is appropriate for storing form field values, such as those entered when submitting an application or opening a support case.

    4. Only need to retrieve data using unique keys – Web Storage only provides the ability to retrieve values of a data item based on a unique key. Web Storage is ideal for storing application/site settings or parameters, such as user preferences and user profile data.
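
    Because of the size limit noted in item 2, a write to Web Storage can fail: browsers throw an exception (a QuotaExceededError) from setItem when the quota would be exceeded. One way to handle this, shown here as a minimal sketch with the hypothetical names WritableStore and trySetItem, is to report failure to the caller instead of letting the exception propagate.

```typescript
// The only part of the browser's Storage interface this sketch relies on.
interface WritableStore {
  setItem(key: string, value: string): void;
}

// Attempt a write and report whether it succeeded. Any exception from
// setItem (for example, when the storage quota would be exceeded) is
// treated as a failed write.
function trySetItem(store: WritableStore, key: string, value: string): boolean {
  try {
    store.setItem(key, value);
    return true;
  } catch (_error) {
    return false;
  }
}
```

    A caller could use the returned flag to skip non-essential caching or to fall back to another storage mechanism when the write does not succeed.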

    When to use IndexedDB

    The following are scenarios where IndexedDB is the best architectural solution:

    1. Need ability to store larger sets of data – IndexedDB does not have the same 5-10 MB size limitation that Web Storage does. Also, since data access is asynchronous, large data sets can be loaded without impacting application responsiveness or user experience. Therefore, it is ideal for storing large data lists, such as cached dictionary data for selection lists.

    2. Need ability to store object data – IndexedDB provides support for objectstores containing multiple instances of a specific object, thus it is ideal to support offline record creation/modification and the offline viewing of a product catalog or customer data.

    3. Need ability to retrieve one or more records using different retrieval criteria – Unlike Web Storage, IndexedDB not only provides the ability to retrieve a single record based on its key, but also one or more records based on any indexed field. For example, IndexedDB provides the ability to retrieve a single employee based on their Employee ID or multiple employees based on Department or State.
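
    Taken together, the criteria and use cases above can be condensed into a small decision helper. The sketch below is only one possible reading of that guidance: the StorageNeeds fields and the chooseStorage function are hypothetical, and what counts as a "large" volume is left to the caller's judgment.

```typescript
type StorageChoice = "sessionStorage" | "localStorage" | "IndexedDB";

// Hypothetical summary of the key criteria discussed in this article.
interface StorageNeeds {
  persistAcrossSessions: boolean; // permanency of data
  multipleRecords: boolean;       // a collection of records, not one value
  queryByField: boolean;          // lookups by something other than the key
  largeVolume: boolean;           // too large to read synchronously
}

function chooseStorage(needs: StorageNeeds): StorageChoice {
  // Collections, field-based queries, and large volumes favor IndexedDB.
  if (needs.multipleRecords || needs.queryByField || needs.largeVolume) {
    return "IndexedDB";
  }
  // Small, key-addressed values fit Web Storage; lifetime picks the store.
  return needs.persistAcrossSessions ? "localStorage" : "sessionStorage";
}
```

    For instance, temporary filter criteria would map to sessionStorage, a saved user preference to localStorage, and an offline product catalog to IndexedDB.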

    Choosing the right data storage options

    Web Storage is great for simple data storage and is an obvious improvement over cookies. However, other use cases require a more robust storage mechanism. IndexedDB is better for complex data storage (rather than simple strings) and for storing larger data sets (hundreds or thousands of records). IndexedDB also provides indexing to support fast retrieval from larger data sets and querying data using multiple criteria.

    However, it is important to remember the old adage, attributed to Bernard Baruch among others: “If all you have is a hammer, everything looks like a nail.” This sentiment applies to solution architecture, as it does to most things. Web developers and architects need to determine which storage mechanism is optimal for the specific data being stored, rather than attempting to standardize on Web Storage or IndexedDB across a website. Developers should avoid overcomplicating a website by using only IndexedDB because it is perceived as more robust, or using only Web Storage because it is simpler. Either approach is likely to make the website harder to maintain. It is better to understand the technical differences and use cases and then apply the right technology for each need.
