Note: this requires MacPorts.
It’s 3am and you’re hunting for that graph you clipped a couple of weeks ago. The report is due at 10 the next morning. You swore up and down you’d never do this again, but here we are: you can’t find the file or the reference.
If only you could search the text in the actual images themselves.
Well, now you can. All you need is
sudo port install tesseract
and then run the script below against each of your image files. It performs OCR
(optical character recognition) on the text in the file and gives you a PDF
out the other end, named after the image’s modification time. The PDFs are fully
searchable and should appear in Spotlight results as soon as they’ve been indexed.
Note that the script moves the original image to the Trash once the PDF has been written.
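#!/bin/bash
set -f   # disable filename globbing
set -e   # stop at the first command that fails

# The script expects a single image file as its only argument.
if [ ! -f "$1" ] ; then
  echo "File $1 does not exist"
  exit 1
fi

# OCR the image as English text and write a searchable PDF named
# ss_<file modification time>.pdf in the current directory.
tesseract -l eng --psm 3 "$1" "$(date -r "$1" +"ss_%Y%m%d%H%M%S")" pdf

# Move the original image to the Trash.
trash -v -F "$1"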
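Because the script checks for a single file in "$1", covering a whole folder of images takes a small loop around it. The following is only a sketch under a few assumptions: the script has been saved as ~/bin/ocr2pdf.sh (a made-up name and location), the function name ocr_dir is likewise hypothetical, and the list of image extensions is a guess at what you are likely to have lying around.

```shell
#!/bin/bash
# Hypothetical wrapper: run the OCR script once per image in a directory.
# OCR_SCRIPT is an assumed location; point it at wherever you saved the script.
OCR_SCRIPT="${OCR_SCRIPT:-$HOME/bin/ocr2pdf.sh}"

ocr_dir() {
  local dir="${1:-.}"
  # Run in a subshell so the caller's working directory is left alone.
  # The OCR script writes its PDF to the current directory, so we cd
  # into the image folder first and the PDFs land next to the images.
  (
    cd "$dir" || exit 1
    for f in *.png *.jpg *.jpeg *.tif *.tiff; do
      # A pattern with no matches stays as literal text; skip it.
      [ -f "$f" ] || continue
      # Report a failure but carry on with the remaining images.
      bash "$OCR_SCRIPT" "$f" || echo "Failed: $f" >&2
    done
  )
}
```

After loading the function into your shell (for example with source), calling ocr_dir ~/Desktop/screenshots would process every matching image in that folder. Two images with the same modification time, down to the second, would get the same PDF name, so it is worth checking the output if you clip things in quick succession.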