Optimizing Automation Scripts: How We Reduced Execution Time from 1 Hour to 30 Seconds, a 140x Improvement

TL;DR
We optimized our Pub/Sub automation script, reducing execution time from over 1 hour to less than 30 seconds using Node.js. Here’s a quick overview of the improvements:
Initial Python Script: Sequential processing, took over 1 hour.
Improved Python Script: Asynchronous processing, reduced time to 5 minutes.
Node.js Script: Further optimized, reduced time to less than 30 seconds.

From Over 1 Hour to Less Than 30 Seconds: Optimizing Pub/Sub Automation
In our quest to enhance performance, we transformed our Pub/Sub automation script from a slow, sequential process to a lightning-fast, asynchronous one. Here’s how we did it.
Initial Python Script
Our initial script was simple but slow, taking over 1 hour to create Pub/Sub topics and subscriptions sequentially.

from google.cloud import pubsub_v1
import mysql.connector


def create_topic(project_id, topic_name):
    # A new client is built on every call, which adds to the per-resource cost.
    publisher = pubsub_v1.PublisherClient()
    publisher.create_topic(request={"name": publisher.topic_path(project_id, topic_name)})


def create_subscription(project_id, topic_name, subscription_name, filter_expression):
    subscriber = pubsub_v1.SubscriberClient()
    subscriber.create_subscription(request={
        "name": subscriber.subscription_path(project_id, subscription_name),
        "topic": subscriber.topic_path(project_id, topic_name),
        "filter": filter_expression,
    })


def fetch_data():
    # Connection details are left blank here; fill in your own.
    conn = mysql.connector.connect(host="", port=3306, user="", password="", database="ecms")
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM your_table")
    results = cursor.fetchall()
    conn.close()
    return results


def create_pubsub_resources(project_id, name):
    topic_name = f"topic-{name}"
    subscription_name = f"subscription-{name}"
    filter_expression = f'attributes.name = "{name}"'
    create_topic(project_id, topic_name)
    create_subscription(project_id, topic_name, subscription_name, filter_expression)


if __name__ == "__main__":
    project_id = "your-project-id"
    data = fetch_data()
    # Each row waits for the previous row's topic and subscription to finish.
    for (name,) in data:
        create_pubsub_resources(project_id, name)
Key Details:
Sequential Processing: Each Pub/Sub resource is created one after the other.
Execution Time: Over 1 hour due to the sequential nature and blocking I/O operations.
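To make the cost of sequential, blocking calls concrete, here is a minimal sketch that replaces the real Pub/Sub calls with a sleep. The `fake_create` helper and the 10 ms latency are hypothetical stand-ins, not measurements from our script; the point is only that total time grows as (number of names) x (calls per name) x (latency per call).

```python
import time


def fake_create(resource_name: str, latency: float = 0.01) -> str:
    """Hypothetical stand-in for one blocking create call; sleeps instead of calling Pub/Sub."""
    time.sleep(latency)
    return resource_name


def run_sequentially(names: list[str], latency: float = 0.01) -> float:
    """Create a topic and then a subscription per name, one call at a time.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    for name in names:
        fake_create(f"topic-{name}", latency)
        fake_create(f"subscription-{name}", latency)
    return time.perf_counter() - start


if __name__ == "__main__":
    names = [f"n{i}" for i in range(20)]
    elapsed = run_sequentially(names)
    lower_bound = len(names) * 2 * 0.01
    print(f"{len(names)} names took {elapsed:.2f}s (lower bound {lower_bound:.2f}s)")
```

With real network round trips in place of the sleep, the same multiplication is what pushes a large batch past the one-hour mark.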
Improved Python Script
By introducing asynchronous processing, we reduced the execution time to 5 minutes.

import asyncio
from google.cloud import pubsub_v1
import mysql.connector


async def create_topic(publisher, project_id, topic_name):
    # The client call is blocking, so it runs in a worker thread.
    await asyncio.to_thread(publisher.create_topic, request={"name": publisher.topic_path(project_id, topic_name)})


async def create_subscription(subscriber, project_id, topic_name, subscription_name, filter_expression):
    await asyncio.to_thread(subscriber.create_subscription, request={
        "name": subscriber.subscription_path(project_id, subscription_name),
        "topic": subscriber.topic_path(project_id, topic_name),
        "filter": filter_expression,
    })


def fetch_data():
    # Connection details are left blank here; fill in your own.
    conn = mysql.connector.connect(host="", port=3306, user="", password="", database="ecms")
    cursor = conn.cursor()
    cursor.execute("SELECT name FROM your_table")
    results = cursor.fetchall()
    conn.close()
    return results


async def create_pubsub_resources(project_id, name):
    publisher = pubsub_v1.PublisherClient()
    subscriber = pubsub_v1.SubscriberClient()
    topic_name = f"topic-{name}"
    subscription_name = f"subscription-{name}"
    filter_expression = f'attributes.name = "{name}"'
    # The subscription needs its topic to exist, so these two stay ordered.
    await create_topic(publisher, project_id, topic_name)
    await create_subscription(subscriber, project_id, topic_name, subscription_name, filter_expression)


async def main():
    project_id = "your-project-id"
    data = fetch_data()
    # One task per row; different rows run concurrently.
    tasks = [create_pubsub_resources(project_id, name) for (name,) in data]
    await asyncio.gather(*tasks)


if __name__ == "__main__":
    asyncio.run(main())
Key Details:
Asynchronous Processing: Uses asyncio to run tasks concurrently.
Execution Time: Reduced to 5 minutes by overlapping I/O operations.
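One detail worth knowing about this approach: `asyncio.to_thread` hands work to the event loop's default thread pool, and in recent CPython versions that pool is capped at roughly min(32, CPU count + 4) workers. The sketch below uses a sleep as a hypothetical stand-in for a blocking client call and shows how the cap bounds the amount of overlap. Whether this cap was the limiting factor in our own 5-minute run is an assumption; we did not measure it separately.

```python
import asyncio
import os
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def blocking_call(latency: float) -> None:
    """Hypothetical stand-in for a blocking Pub/Sub client call."""
    time.sleep(latency)


async def run_batch(n_calls: int, latency: float, max_workers: Optional[int] = None) -> float:
    """Run n_calls blocking calls through asyncio.to_thread and return elapsed seconds.

    If max_workers is given, a pool of that size replaces the loop's default executor.
    """
    if max_workers is not None:
        loop = asyncio.get_running_loop()
        loop.set_default_executor(ThreadPoolExecutor(max_workers=max_workers))
    start = time.perf_counter()
    await asyncio.gather(*(asyncio.to_thread(blocking_call, latency) for _ in range(n_calls)))
    return time.perf_counter() - start


if __name__ == "__main__":
    approx_default = min(32, (os.cpu_count() or 1) + 4)
    print(f"approximate default pool size: {approx_default}")
    narrow = asyncio.run(run_batch(8, 0.05, max_workers=2))
    wide = asyncio.run(run_batch(8, 0.05, max_workers=8))
    print(f"2 workers: {narrow:.2f}s, 8 workers: {wide:.2f}s")
```

With 2 workers the 8 calls run in four waves; with 8 workers they run in one. A larger pool is not free, since each thread costs memory and the Pub/Sub API enforces its own quotas, so treat the pool size as something to tune rather than maximize.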
Node.js Script
Finally, rewriting the script in Node.js reduced the execution time to less than 30 seconds.

const {PubSub} = require('@google-cloud/pubsub');
const mysql = require('mysql2');
const {promisify} = require('util');

async function createTopic(pubSubClient, topicName) {
  const topic = pubSubClient.topic(topicName);
  await topic.create();
}

async function createSubscription(pubSubClient, topicName, subscriptionName, filterExpression) {
  const topic = pubSubClient.topic(topicName);
  const subscription = topic.subscription(subscriptionName);
  await subscription.create({filter: filterExpression});
}

async function fetchData() {
  // Connection details are left blank here; fill in your own.
  const connection = mysql.createConnection({host: "", port: 3306, user: "", password: "", database: "ecms"});
  const query = "SELECT name FROM your_table";
  // Wrap the callback-style query so it can be awaited.
  const rows = await promisify(connection.query).bind(connection)(query);
  connection.end();
  return rows;
}

async function createPubSubResources(pubSubClient, name) {
  const topicName = `topic-${name}`;
  const subscriptionName = `subscription-${name}`;
  const filterExpression = `attributes.name = "${name}"`;
  // The subscription needs its topic to exist, so these two stay ordered.
  await createTopic(pubSubClient, topicName);
  await createSubscription(pubSubClient, topicName, subscriptionName, filterExpression);
}

async function main() {
  const projectId = "your-project-id";
  // A single client is shared by every request.
  const pubSubClient = new PubSub({projectId});
  const data = await fetchData();
  // Start all rows at once and wait for every one to finish.
  const tasks = data.map(({name}) =>
    createPubSubResources(pubSubClient, name)
  );
  await Promise.all(tasks);
}

main().catch(console.error);
Key Details:
Non-Blocking I/O: Node.js starts every create request without waiting for the previous one to finish, and a single PubSub client is shared across all of them instead of building new clients per row.
Execution Time: Reduced to less than 30 seconds due to the efficient handling of concurrent operations.
Conclusion
By leveraging asynchronous programming and Node.js, we achieved a significant performance boost, reducing the execution time from over 1 hour to less than 30 seconds. This journey underscores the importance of optimizing code for better efficiency and scalability.
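For reference, the speedup at each stage can be worked out from the rounded timings quoted in this post. This is plain arithmetic on those figures, not a new measurement; because the endpoints are stated as "over 1 hour" and "less than 30 seconds", the end-to-end number below is a lower bound, which is consistent with the larger factor in the title.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_seconds / after_seconds


# Rounded timings quoted in this post, in seconds.
SEQUENTIAL_PYTHON = 60 * 60  # "over 1 hour"
ASYNC_PYTHON = 5 * 60        # "5 minutes"
NODE_JS = 30                 # "less than 30 seconds"

print(speedup(SEQUENTIAL_PYTHON, ASYNC_PYTHON))  # 12.0
print(speedup(ASYNC_PYTHON, NODE_JS))            # 10.0
print(speedup(SEQUENTIAL_PYTHON, NODE_JS))       # 120.0
```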