Using AWS Rekognition to identify race participants
Besides being techy, I dabble in the occasional run at 5k and 10k distances. At many races, photographers capture you, usually in some very unflattering pose. The problem is that there can be hundreds of photos to check through - sometimes thousands.
Solution - AWS Rekognition
Recently, AWS announced several AI services around image processing. Rekognition was one of the services, and I’d been looking for a realistic test other than identifying hotdogs.
As an API-based service, Rekognition doesn’t need Lambda or EC2 instances, although of course they can take advantage of the service. Given an image (and, more recently, video), AWS Rekognition offers several capabilities:
- Object and text detection - identifying objects and text within an image.
- Face identification - recognising celebrities or comparing against known faces.
- Facial analysis - identifying gender, emotion, and facial features.
I put together some code which takes the photos from a race and checks whether any managed to include me (or anyone with a specific race number).
But first, a quick example of how to use Rekognition.
Basic example
I tend to use Python a lot for development these days, and luckily there is an AWS SDK called boto3, which gives access to the various API services available from AWS; similar SDKs are available for other languages such as Node.js and Java. Boto3 has interfaces to virtually all of the AWS services; if you already have it installed and can’t find an interface for Rekognition, make sure you’re running version 2.7 or greater.
Using Rekognition with boto3 is extremely simple: configure authentication as described in the documentation (or use my virtualaws tool, which builds on virtualenv), create a session to connect to AWS, and then call the required service. A simple example might look like this:
#! /usr/bin/env python

import json

import boto3
import urllib3

imageUrl = 'https://<image url>.jpg'

# - Setup connection to the Rekognition service --------------------------------
client = boto3.client('rekognition')

# - Retrieve the image data ----------------------------------------------------
urllib3.disable_warnings()
http = urllib3.PoolManager()

img = http.request('GET', imageUrl, retries=False)

# - Ask Rekognition for any text it can find in the image ----------------------
rekog = client.detect_text(Image={'Bytes': img.data})

print(json.dumps(rekog['TextDetections'], indent=2, sort_keys=True))
The above code outputs the full list of detections; a single entry looks like this:
{
  "Confidence": 98.27688598632812,
  "DetectedText": "175",
  "Geometry": {
    "BoundingBox": {
      "Height": 0.18716847896575928,
      "Left": 0.480046808719635,
      "Top": 0.2529766857624054,
      "Width": 0.1664421707391739
    },
    "Polygon": [
      {
        "X": 0.47943100333213806,
        "Y": 0.25560006499290466
      },
      {
        "X": 0.6463190913200378,
        "Y": 0.253519743680954
      },
      {
        "X": 0.6468464732170105,
        "Y": 0.4341394007205963
      },
      {
        "X": 0.4799584150314331,
        "Y": 0.436219722032547
      }
    ]
  },
  "Id": 2,
  "ParentId": 0,
  "Type": "WORD"
}
The JSON shows that Rekognition recognises the vest number even with the glare over the image. The data also provides a confidence rating and details of where the text sits within the picture; the detect_text call and the structure of its response are covered in the boto3 documentation.
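To pick out likely vest numbers from that response, you can filter for WORD detections above a confidence threshold. A minimal sketch - the response dict below is made up to mirror the shape detect_text returns, and the 90% threshold is an arbitrary choice of mine:

```python
# Illustrative response in the shape returned by detect_text (values are made up)
rekog = {
    'TextDetections': [
        {'DetectedText': '175', 'Type': 'WORD', 'Confidence': 98.3},
        {'DetectedText': 'ACME', 'Type': 'WORD', 'Confidence': 54.1},
        {'DetectedText': '175 ACME', 'Type': 'LINE', 'Confidence': 97.0},
    ]
}

# Keep only individual words we're reasonably sure about
words = [d['DetectedText'] for d in rekog['TextDetections']
         if d['Type'] == 'WORD' and d['Confidence'] >= 90]

print(words)  # ['175']
```

Filtering on Type avoids counting the same text twice, since Rekognition returns both LINE and WORD detections for the same region.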
Processing race images
The example above shows how simple it can be to identify a vest number in a single image, but how do you process hundreds of photos from a race day? Provided you have a list of image URLs, the following code will identify the race numbers in each image:
#! /usr/bin/env python

import json
import sys

import boto3
import urllib3

# - Setup libraries ------------------------------------------------------------
client = boto3.client('rekognition')

urllib3.disable_warnings()
http = urllib3.PoolManager()

# - Read the list of image URLs ------------------------------------------------
try:
    with open('url_list.txt', 'r') as text_file:
        urls = text_file.readlines()
except IOError:
    print("Can't open list of urls")
    sys.exit(1)

urls = [x.strip() for x in urls]

# - Retrieve images using the supplied urls ------------------------------------
vestNumbers = {}

print("Retrieving images")
for url in urls:
    try:
        r = http.request('GET', url, retries=False)
    except urllib3.exceptions.HTTPError as e:
        print("Image retrieval failed: %s" % e)
        continue

    if r.status == 200:  # Check we've managed to download some data

        # - Call the Rekognition API with the image data
        rekog = client.detect_text(Image={'Bytes': r.data})

        for word in rekog['TextDetections']:
            if word['Type'] == 'WORD' and word['DetectedText'].isdigit():
                vest = word['DetectedText']
                print("Detected number: %s" % vest)

                if vest not in vestNumbers:
                    vestNumbers[vest] = []

                vestInfo = {}
                vestInfo['url'] = url
                vestInfo['confidence'] = word['Confidence']

                vestNumbers[vest].append(vestInfo)

with open('vestInfo.json', 'w') as jsonFile:
    json.dump(vestNumbers, jsonFile, indent=2, sort_keys=True)
The code above outputs JSON listing the vest numbers found; for each number there is a list of matching images with a confidence level for each.
{
  "0": [
    {
      "confidence": 92.21312345,
      "url": "https://<exampleurl>.jpg"
    }
  ],
  ...
  "175": [
    {
      "confidence": 92.21312345,
      "url": "https://<exampleurl>.jpg"
    },
    {
      "confidence": 84.98798787,
      "url": "https://<exampleurl>.jpg"
    }
  ]
}
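Once vestInfo.json exists, finding your own photos is just a dictionary lookup and a sort. A quick sketch using made-up data in the same shape (the number 175 and the URLs below are illustrative):

```python
# Sample of the structure written to vestInfo.json (values are made up)
vestNumbers = {
    '175': [
        {'confidence': 84.98, 'url': 'https://example.com/photo2.jpg'},
        {'confidence': 92.21, 'url': 'https://example.com/photo1.jpg'},
    ]
}

# Best matches first, so the clearest shot of the vest tops the list
matches = sorted(vestNumbers.get('175', []),
                 key=lambda m: m['confidence'], reverse=True)

for m in matches:
    print("%6.2f%%  %s" % (m['confidence'], m['url']))
```

Using .get() with a default empty list means a number that never appeared in any photo simply produces no output rather than an error.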
Pricing
For the first year of your account, you’re entitled to what is known as the Free Tier, which lets you evaluate AWS services. During that first year, you can analyse 5,000 images for free each month.
According to the AWS pricing page, in the London AWS region, it currently costs $1.16 per 1000 images processed.
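Those two figures make cost estimates straightforward. A rough sketch, assuming the London-region rate above and the 5,000-image monthly Free Tier allowance:

```python
# Rough monthly cost estimate, assuming the London-region rate of $1.16 per
# 1,000 images and the 5,000-image Free Tier allowance (first year only)
RATE_PER_1000 = 1.16
FREE_TIER = 5000

def monthly_cost(images):
    billable = max(0, images - FREE_TIER)
    return billable / 1000.0 * RATE_PER_1000

print(monthly_cost(2000))   # within the free allowance, so nothing to pay
print(monthly_cost(15000))  # 10,000 billable images
```

So a typical race day of a couple of thousand photos fits comfortably inside the free allowance, and even well beyond it the cost stays in the low dollars.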
Updated 26/04: Added pricing info