Backfill scripts involve boring but important tasks like
- Adding a new column or field to every row in a database with millions of entries.
- Running inference on images, videos, or other entities because your intern messed up and proper scores have not been generated since Diwali weekend.
- Impressing your manager, for reasons unknown.
But running backfill scripts can be a headache. Here's why:
- Rate limits and max visibility timeouts for queues: more often than not, the microservice you depend on won't let you run your task the way you'd like. And fairly so: rate limits exist for a reason, and your backfill task shouldn't interfere with requests from actual users.
- Error handling: when a backfill script takes days, if not weeks, to run, it becomes important to track failures and restart the script from exactly where it stopped. To handle this, at the very least log the failed IDs to an error.txt file.
- You need to check in on your tmux session or VM every 4-5 hours to confirm the script is still running properly. Setting alerts in your calendar is the way to go. An added benefit: when your colleagues see your packed calendar, they will be impressed by your meticulous nature.
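The first two headaches above can be sketched in one small loop. This is only a minimal illustration, not a definitive implementation: `run_backfill`, the `process` callback, the `done.txt` checkpoint file, and the `max_per_second` knob are all hypothetical names chosen for this example, and it assumes IDs are strings with no newlines. It throttles calls so the backfill stays under a self-imposed rate, logs failed IDs to an error file, and skips IDs that already succeeded when the script is restarted.

```python
import time
from pathlib import Path


def run_backfill(ids, process, done_path, error_path, max_per_second=5.0):
    """Process each ID once, throttled; return (processed, failed) counts.

    Assumptions (hypothetical, for illustration only):
    - `process` is your own function that handles a single ID and raises on failure.
    - IDs are strings without newlines.
    """
    done_file = Path(done_path)
    # IDs that succeeded in a previous run are skipped, so a restart resumes
    # where the last run stopped instead of starting over.
    already_done = set(done_file.read_text().splitlines()) if done_file.exists() else set()

    min_interval = 1.0 / max_per_second
    processed = failed = 0

    with open(done_path, "a") as done_log, open(error_path, "a") as error_log:
        for item_id in ids:
            if item_id in already_done:
                continue
            started = time.monotonic()
            try:
                process(item_id)
            except Exception as exc:
                # Record the failure and keep going; failed IDs are not marked
                # as done, so the next run retries them.
                error_log.write(f"{item_id}\t{exc!r}\n")
                error_log.flush()
                failed += 1
            else:
                done_log.write(f"{item_id}\n")
                done_log.flush()
                processed += 1
            # Sleep off whatever is left of this item's time slot so we never
            # exceed max_per_second calls to the downstream service.
            elapsed = time.monotonic() - started
            if elapsed < min_interval:
                time.sleep(min_interval - elapsed)

    return processed, failed
```

Calling `run_backfill(all_ids, score_image, "done.txt", "error.txt", max_per_second=2)` a second time after a crash would skip everything already listed in `done.txt`; here `all_ids` and `score_image` stand in for your own ID source and per-item task. A fixed sleep is the simplest throttle; if the service returns explicit rate-limit responses, backing off on those would be a natural extension.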
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from datetime import datetime
import os
import torch.nn.functional as F
import matplotlib.pyplot as plt
from torchvision.utils import save_image
from torchvision.transforms import functional as TF
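import random


class SimpleCNN(nn.Module):
    """Small CNN: two conv + pool stages followed by two fully connected layers."""

    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=3)   # 1 input channel -> 10 feature maps
        self.conv2 = nn.Conv2d(10, 20, kernel_size=3)  # 10 -> 20 feature maps
        # 20 * 5 * 5 assumes 1x28x28 inputs (e.g. MNIST): 28 -> 26 -> 13 -> 11 -> 5
        self.fc1 = nn.Linear(20 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)                  # 10 output classes
        self.pool = nn.MaxPool2d(2)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(-1, 20 * 5 * 5)  # flatten the feature maps for the linear layers
        x = self.relu(self.fc1(x))
        x = self.fc2(x)             # raw logits; no softmax applied here
        return x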