Day 4 - Deploying a Flask Web App on Cloud Run and Implementing Traffic Splitting on Google Cloud

Cloud Run is a fully managed platform for deploying and scaling containerized applications. One of its features is the ability to split traffic between different revisions of a deployed service, which lets you perform A/B testing, canary releases, and gradual rollouts with minimal effort.
In this article, I'll walk you through deploying a Flask web app on Cloud Run using the gcloud CLI and show how to split traffic between two versions of it.
The code for this article can be found in the Day_4 folder of this GitHub repo.

Prerequisites:

  1. Python 3.x

  2. Google Cloud Project with the Cloud Build Service Account role granted to the Compute Engine default service account

  3. gcloud CLI installed locally on your computer (Instructions)

Setting up:

  1. Create a Google Cloud project and verify that Billing has been enabled.

  2. Download, install and configure gcloud CLI on your computer (instructions)

  3. Create a folder to work in, then create and activate a Python virtual environment for the project's dependencies:

     $ python -m venv venv
    
     # Activate the virtual environment
     $ venv\Scripts\activate # For Windows
    
     $ source venv/bin/activate # For macOS/Linux
    
  4. Create a requirements.txt file and add the following dependencies:

     Flask==3.0.3
     gunicorn==22.0.0
     Werkzeug==3.0.3
     requests==2.32.3
    
  5. Install the dependencies in your activated virtual environment:

     (venv)$ pip install -r requirements.txt
    
  6. Create a file called main.py and paste the following code:

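     import requests
     from flask import Flask, render_template

     app = Flask(__name__)

     @app.route("/")
     def get_random_quote():
         """Fetch a random quote from the API and render it."""
         # A timeout keeps the request from hanging if the API is slow to respond
         response = requests.get('https://dummyjson.com/quotes/random', timeout=10)
         # Raise an error for 4xx/5xx responses instead of rendering a bad payload
         response.raise_for_status()
         quote = response.json()
         return render_template('index.html', quote=quote)

     if __name__ == "__main__":
         # Start Flask's development server (for local testing only)
         app.run(debug=True, host="0.0.0.0", port=8080)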
    
  7. Create a folder named templates and inside it, create a file called index.html:

     <!DOCTYPE html>
     <html lang="en">
     <head>
         <meta charset="UTF-8">
         <meta name="viewport" content="width=device-width, initial-scale=1.0">
         <title>Flask Demo Random Quote</title>
         <style>
             .theroot{
                 height: 100vh;
                 width: 100%;
                 display: flex;
                 justify-content: center;
                 align-items: center;
             }
             .container {
                 text-align: center;
             }
         </style>
     </head>
     <body>
         <div class="theroot">
             <div class="container">
                 <h2><b>Random Quote Generator</b></h2>
                 <p>✍️ <b>Author</b>: {{quote.author}}</p>
                 <p>📃 <b>Quote</b>: {{quote.quote}}</p>
             </div>
         </div>
     </body>
     </html>
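
The template above reads only two fields from the quote object: author and quote. As a minimal sketch, assuming the API payload is a JSON object carrying those two keys (inferred from the template, not checked against the API documentation), a small helper could guarantee both keys exist before the template is rendered. The name normalize_quote and the fallback strings are hypothetical and not part of the original project:

```python
def normalize_quote(payload):
    """Return a dict that always has the 'author' and 'quote' keys used by index.html."""
    # Anything that is not a JSON object is treated as an empty payload.
    if not isinstance(payload, dict):
        payload = {}
    return {
        # Fall back to placeholder text when a field is missing or empty.
        "author": payload.get("author") or "Unknown",
        "quote": payload.get("quote") or "No quote available.",
    }


if __name__ == "__main__":
    print(normalize_quote({"id": 1, "quote": "Stay curious.", "author": "Someone"}))
    print(normalize_quote({}))
```

In main.py this could be applied as quote = normalize_quote(response.json()) just before the call to render_template.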
    

    Your structure should look something like this:

     your_project_folder/
         |
         |- templates/
         |     |- index.html
         |
         |- main.py
         |- requirements.txt
    

    Script explanation:

    The script above imports requests and uses it to fetch data from the dummyjson API, which provides random quotes.

    We are also using Flask, a web framework for building web applications. It gives us a structure for handling incoming requests, routing them to the right function and rendering data into HTML templates.

    app = Flask(__name__) creates an instance of our Flask application. The __name__ argument is the name of the current Python module.

    @app.route("/") registers the function below it as the handler for the root URL of our web app.

    get_random_quote() retrieves a random quote from the dummyjson API and uses Flask's render_template function to render the quote in the index.html template.

    When main.py is run directly, app.run(...) starts Flask's development server on port 8080.
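
Port 8080 is hard-coded in the script. Cloud Run tells a container which port to listen on through the PORT environment variable, so a slightly more defensive variant reads that variable and falls back to 8080 when it is unset. A minimal sketch, where resolve_port is a hypothetical helper name that is not part of the original script:

```python
import os


def resolve_port(environ=None, default=8080):
    """Return the port to listen on, preferring the PORT environment variable."""
    # Use the real environment unless a mapping is passed in (handy for testing).
    environ = os.environ if environ is None else environ
    try:
        return int(environ.get("PORT", default))
    except ValueError:
        # PORT was set to something that is not a number; use the default.
        return default


if __name__ == "__main__":
    print(resolve_port({"PORT": "9090"}))
    print(resolve_port({}))
```

In main.py, the last line would then become app.run(debug=True, host="0.0.0.0", port=resolve_port()).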

  8. Run the project locally and visit http://127.0.0.1:8080 in your browser:

     (venv)$ python main.py
    

    The browser should show the Random Quote Generator heading followed by the author and the quote.

  9. Deploy the app on Google Cloud Run:

    In your terminal/command prompt, verify that you’re in the same location as the main.py file.

    Run the following command to start deploying your app to Cloud Run:

     (venv)$ gcloud run deploy --source .
    

    You will be asked for a service name. You can use something like random-quote-generator.

    You’ll then be asked to specify the deployment region, either by number or by name. Pick one, e.g. us-east1.

    When prompted whether to allow unauthenticated invocations, choose y and hit Enter.

    gcloud will bundle up the app, build it into a container and deploy it to Cloud Run.

    Once it’s done, a message in the terminal will confirm that the app has been deployed and show its Service URL.

    Copy the Service URL and paste it into the browser to verify that the app works as it did locally.
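
The interactive prompts above can also be answered up front with flags, which is handy for repeat deployments. The sketch below only assembles and prints the command rather than running it. build_deploy_command is a hypothetical helper, and the service name and region are the example values used in this step:

```python
import shlex


def build_deploy_command(service, region, allow_unauthenticated=True):
    """Build a non-interactive `gcloud run deploy` argument list."""
    command = ["gcloud", "run", "deploy", service, "--source", ".", "--region", region]
    if allow_unauthenticated:
        # Same effect as answering "y" to the unauthenticated invocations prompt.
        command.append("--allow-unauthenticated")
    return command


if __name__ == "__main__":
    cmd = build_deploy_command("random-quote-generator", "us-east1")
    # Print the command for review; subprocess.run(cmd, check=True) would execute it.
    print(shlex.join(cmd))
```

Returning a list of arguments rather than a single string means the command can be handed to subprocess.run without going through a shell.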

  10. Navigate to the Cloud Run page on the Google Cloud console and verify that your app is listed.

  11. Deploy the second version of your app:

    Go back to your code editor and do the following:

    In your project folder, create a folder called static, and inside it create a folder called images. Download a sample image and place it in this images folder.

    Open index.html and modify it to reference the image you’ve added:

    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Flask Demo Random Quote</title>
        <style>
            .theroot{
                height: 100vh;
                width: 100%;
                display: flex;
                justify-content: center;
                align-items: center;
            }
            .container {
                text-align: center;
            }
        </style>
    </head>
    <body>
        <div class="theroot">
            <div class="container">
                <!-- Start of modification -->
                <img src="{{ url_for('static', filename='images/YOUR-IMAGE-NAME-HERE.jpg') }}" style="width:150px;height:auto"/>
                <h2><b>Random Quote Generator: Version 2</b></h2>
                <!-- End of modification -->
                <p>✍️ <b>Author</b>: {{quote.author}}</p>
                <p>📃 <b>Quote</b>: {{quote.quote}}</p>
            </div>
        </div>
    </body>
    </html>
    

    Your new directory structure should now look something like this:

    your_project_folder/
        |
        |- static/
        |    |- images/
        |          |- YOUR-IMAGE-NAME-HERE.jpg
        |
        |- templates/
        |     |- index.html
        |
        |- main.py
        |- requirements.txt
    

    Go back to your terminal/command prompt and deploy the app again, using the same service name and region as in the previous deployment (see step 9 ☝️).

  12. Verify that the second version has been deployed by visiting the Service URL. The heading should now read Random Quote Generator: Version 2, with your image above it.
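
Besides looking at the page in a browser, this check can be scripted. A minimal sketch using only the standard library: it treats a page as version 2 when it contains the heading text added in step 11. The helper names are hypothetical, and fetch_pages expects your own Service URL:

```python
from collections import Counter
from urllib.request import urlopen

# Heading text that only the second version of index.html contains.
VERSION_2_MARKER = "Random Quote Generator: Version 2"


def classify_version(html):
    """Label a rendered page as 'v2' if it contains the version 2 heading, else 'v1'."""
    return "v2" if VERSION_2_MARKER in html else "v1"


def tally_versions(pages):
    """Count how many of the given HTML pages came from each version."""
    return Counter(classify_version(page) for page in pages)


def fetch_pages(service_url, attempts=10):
    """Request the service several times and return the HTML of each response."""
    pages = []
    for _ in range(attempts):
        with urlopen(service_url, timeout=10) as response:
            pages.append(response.read().decode("utf-8"))
    return pages


if __name__ == "__main__":
    # Offline demonstration with hand-written snippets instead of live responses.
    samples = [
        "<h2><b>Random Quote Generator</b></h2>",
        "<h2><b>Random Quote Generator: Version 2</b></h2>",
    ]
    print(tally_versions(samples))
```

Calling tally_versions(fetch_pages("YOUR-SERVICE-URL")) at this point should report only v2, since the latest revision receives all of the traffic after a deploy. The same helpers can be reused once traffic is split in step 14.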

  13. Go back to the Cloud Run page on the Google Cloud console and verify that two revisions are present:

    Click on the service name, e.g. random-quote-generator, and you’ll be taken to its details page.

    Click on the REVISIONS tab to view the deployed revisions of your app.

  14. Split the traffic:

    You will notice that the latest deployed revision has 100% of the traffic directed to it.

    We can split traffic between the two revisions by clicking on MANAGE TRAFFIC.

    You’ll see two input fields: Revision 1 and Revision 2.

    Revision 1 will be auto-populated with the Latest healthy revision with 100% traffic assigned to it.

    Fill in the Revision 2 field by selecting the first version of the deployment.

    Distribute traffic: You can split traffic between the two revisions by assigning a percentage to each. For instance, I split traffic 50-50 between the two.

    Once you’re done, click on SAVE, then visit the deployment URL and refresh a number of times to see both versions of your app being returned in your browser.
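
The same split can also be applied from the terminal with gcloud run services update-traffic and its --to-revisions flag. The sketch below builds that command and checks that the percentages add up to 100 before printing it. build_update_traffic_command is a hypothetical helper and the revision names are placeholders, so use the names shown on the REVISIONS tab instead:

```python
def build_update_traffic_command(service, region, split):
    """Build a `gcloud run services update-traffic` argument list.

    `split` maps revision names to whole-number traffic percentages.
    """
    if any(percent < 0 for percent in split.values()):
        raise ValueError("traffic percentages cannot be negative")
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")
    # --to-revisions takes comma-separated REVISION=PERCENT pairs.
    assignments = ",".join(f"{name}={percent}" for name, percent in split.items())
    return [
        "gcloud", "run", "services", "update-traffic", service,
        "--region", region,
        "--to-revisions", assignments,
    ]


if __name__ == "__main__":
    # Placeholder revision names; replace them with your own.
    split = {
        "random-quote-generator-00001-abc": 50,
        "random-quote-generator-00002-xyz": 50,
    }
    cmd = build_update_traffic_command("random-quote-generator", "us-east1", split)
    # Print for review; subprocess.run(cmd, check=True) would execute it.
    print(" ".join(cmd))
```

Validating the percentages first catches a mistyped split before anything is sent to Cloud Run.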

Conclusion

In this article, we explored the process of deploying a simple Flask web application on Google Cloud Run. We demonstrated how to leverage Cloud Run's traffic splitting feature to safely introduce changes and experiment with different versions of our application.

Cloud Run, with its serverless architecture and ease of use, provides a powerful platform for deploying and scaling modern web applications. I encourage you to experiment with traffic splitting and explore the many other benefits of the Cloud Run platform.