Since 1999

 

3 minutes estimated reading time.

Case Study: Complex Legacy Google Datastore Conversion to PostgreSQL on AWS

Successful conversion from Google Datastore NoSQL to PostgreSQL on AWS!

Client Background

Our client is a State government agency operating a variety of custom web applications used by staff and elected officials for critical business functions. These applications, developed over the years by various contractors, serve as vital tools for the agency’s operations.

Client’s Tech Stack

The client’s web application was initially built on Google Cloud technologies, including the Google Datastore NoSQL database, by a previous contractor around 2012. The choice of Google Datastore was likely due to its popularity at the time. However, when Google sunset the stack the application was built on, a conversion became mandatory. Given that the client had other applications in the AWS cloud and preferred to consolidate their cloud environments, a switch to PostgreSQL on AWS was deemed the best course of action.

Challenges

The transition from Google Datastore to PostgreSQL presented several challenges:

  1. Issues with Google Cloud tools and the old database: The GUI Datastore tooling was difficult to use for data exploration, and the extract tools for Firebase were unable to read the database. Additionally, the older Datastore extract tools were incompatible with the new cloud runners.
  2. Field Mapping challenges: The NoSQL nature of the data made field mapping to a relational database a challenge, especially since the Google Datastore foreign key scheme is not directly compatible with a SQL database.
  3. Local development / data access issues: The database was large and could not be loaded into the locally available data store emulator, even with ample storage and RAM resources.

The Rietta Solution

Despite these challenges, Rietta successfully engineered an end-to-end extract, transform, and load process using open source technology, Docker, and our expertise. This process converted the data from the old Datastore database on the Google Cloud to a standard PostgreSQL database hosted on the client’s AWS cloud. The steps involved:

  • Extracting a full backup of the Datastore database from the Google Cloud to the ETL system
  • Transforming all of the records in all of the tables to JSON
  • Transforming the JSON records into the SQL database and the schema developed by the application developers for the updated web application
  • Loading the PostgreSQL database into AWS

This significant development effort, which other contractors were unable to undertake, was completed on time, allowing the client to successfully migrate their data and consolidate their applications in the AWS cloud.

Our solution was not only technically sound but also efficient and cost-effective. We used a fully scripted, Docker-based process, which made the conversion day much smoother. The client was able to provision an extremely performant Linux ETL server on AWS, which we set up in short order by simply SSHing the tooling and running it. This approach allowed us to avoid any unnecessary delays or complications during the maintenance window where by necessity the application was not available for use. The client was able to only pay for this expensive compute for the time they needed it for the conversion itself after which it was stopped and then later terminated to save money.

Moreover, because of our thorough testing and validation processes, the client was confident in the success of the conversion. This confidence proved to be well placed as the project was a success and the newly updated application has been in production serving the public since.