How to automate summaries for new files with OpenAI API and Python
Yesterday, I implemented my first AI automation. Yeah! Nothing fancy! I just wanted some code that runs and uses an LLM API.
I asked o4-mini for a "simple working example of AI automation." It returned an auto_summarize.py script that watches the incoming directory for new .txt files. When it finds one, it summarizes the contents using the OpenAI API and writes the result to the outgoing directory.
It almost worked, but the script used an old version of the openai library, which led to the following error:
You tried to access openai.ChatCompletion, but this is no longer
supported in openai>=1.0.0 - see the README at
https://github.com/openai/openai-python for the API.
You can run `openai migrate` to automatically upgrade your codebase to
use the 1.0.0 interface.
I knew how to fix it, but out of curiosity I tried the openai migrate command. It almost worked.
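For context, the breaking change behind that error is that the module-level openai.ChatCompletion call became a method on a client object. Here's a minimal before/after sketch (the model name and messages are placeholders, not the exact code o4-mini produced):

```python
# Before openai 1.0 (what older generated scripts did; now fails):
#
#     import openai
#     openai.api_key = os.getenv("OPENAI_API_KEY")
#     resp = openai.ChatCompletion.create(model="gpt-4.1", messages=messages)
#     text = resp["choices"][0]["message"]["content"]
#
# Since openai 1.0 (roughly what `openai migrate` rewrites it to):
#
#     from openai import OpenAI
#     client = OpenAI()  # reads OPENAI_API_KEY from the environment
#     resp = client.chat.completions.create(model="gpt-4.1", messages=messages)
#     text = resp.choices[0].message.content
```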
I wanted to manage the project and dependencies with uv, but o4-mini didn't know about it. So I switched to gpt-4.1, which gave a satisfying answer. In the end, I followed the official uv docs directly.
Basic script
After tweaking the code, here's the solution I ended up with. It uses gorakhargosh/watchdog, specifically the Observer class, to monitor the incoming directory. We create a TextFileSummaryHandler class that inherits from FileSystemEventHandler and overrides on_created to summarize new .txt files.
auto_summarize.py
import os
import time

from openai import OpenAI
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

IN_DIR = "incoming"
OUT_DIR = "outgoing"


class TextFileSummaryHandler(FileSystemEventHandler):
    """Generate summaries for newly created `.txt` files.

    Use the OpenAI API to create a summary, saving the output
    in the `OUT_DIR` directory using the original filename
    with `.txt` replaced by `_summary.txt`."""

    def __init__(self, client):
        super().__init__()
        self.client = client

    def summary(self, text):
        resp = self.client.chat.completions.create(
            model="gpt-4.1",
            messages=[
                {"role": "system", "content": "You are a helpful assistant that summarizes text."},
                {"role": "user", "content": text},
            ],
            temperature=0.3,
            max_completion_tokens=200)
        return resp.choices[0].message.content

    def on_created(self, event):
        if event.is_directory or not event.src_path.endswith(".txt"):
            return
        filepath = event.src_path
        filename = os.path.basename(filepath)
        print(f"Detected {filename}, summarizing…")
        with open(filepath, "r", encoding="utf-8") as f:
            text = f.read()
        summary = self.summary(text)
        out_path = os.path.join(OUT_DIR, filename.replace(".txt", "_summary.txt"))
        with open(out_path, "w", encoding="utf-8") as f:
            f.write(summary)
        print(f"Summary written to {out_path}")


if __name__ == "__main__":
    os.makedirs(IN_DIR, exist_ok=True)
    os.makedirs(OUT_DIR, exist_ok=True)

    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
    event_handler = TextFileSummaryHandler(client)

    observer = Observer()
    observer.schedule(event_handler, IN_DIR, recursive=False)
    observer.start()
    print(f"Watching '{IN_DIR}/' for new .txt files. Press Ctrl+C to stop.")
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        pass
    finally:
        observer.stop()
        observer.join()
Set up the project and start monitoring
To try it, first export the OpenAI API key like this:
$ export OPENAI_API_KEY='<your_openai_api_key>'
Then, to set up the project and start monitoring, run this in the directory containing the auto_summarize.py script:
$ uv init
$ uv add openai watchdog
$ uv run auto_summarize.py
Watching 'incoming/' for new .txt files. Press Ctrl+C to stop.
In another terminal, create the incoming/foo.txt file like this:
$ echo 'The sky is red. The sea is green.' > incoming/foo.txt
Back in the first terminal, you'll see:
Watching 'incoming/' for new .txt files. Press Ctrl+C to stop.
Detected foo.txt, summarizing…
Summary written to outgoing/foo_summary.txt
Finally, the outgoing/foo_summary.txt file might contain:
The sky is described as red and the sea as green.
Processing Files in Parallel
Well. I immediately thought: are the newly created files actually processed in parallel? The great gpt-4.1 will tell you no:
No, by default, the FileSystemEventHandler.on_created method in watchdog is not called in parallel. All event handling is done sequentially, in the same observer thread.
Details:
- watchdog uses a single background thread (the Observer) to monitor filesystem events.
- When files are created, the observer invokes the handler's methods (on_created, on_modified, etc.) one after another, not concurrently.
- If you drop many files quickly, the events are processed in order. If on_created takes a long time (e.g., calling a slow AI API), later events have to wait.
This is true. But how can we actually check this claim? We could look at the source code of the watchdog package. Or, we could just drop a bunch of files into the incoming directory and see how long it takes for them all to be processed. If the handling is sequential, the total time should be the number of files times the OpenAI API call latency.
If you want to try it yourself, run:
$ for i in {1..5}; do echo 'foo bar baz' > incoming/foo-$i.txt; done
If you'd rather not use up your API credits, you can replace this line:
summary = self.summary(text)
with these lines:
time.sleep(2)
summary = "The summary."
Make this change before creating the five files.
So, how can we handle these events in parallel? Well, we can use the ThreadPoolExecutor class like this:
import os
import time

from openai import OpenAI
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler
from concurrent.futures import ThreadPoolExecutor

IN_DIR = "incoming"
OUT_DIR = "outgoing"


class TextFileSummaryHandler(FileSystemEventHandler):
    """Generate summaries for newly created `.txt` files.

    Use the OpenAI API to create a summary, saving the output
    in the `OUT_DIR` directory using the original filename
    with `.txt` replaced by `_summary.txt`."""

    def __init__(self, client, executor):
        super().__init__()
        self.client = client
        self.executor = executor

    def summary(self, text):
        resp = self.client.chat.completions.create(
            model="gpt-4.1",
            messages=[
                {"role": "system", "content": "You are a helpful assistant that summarizes text."},
                {"role": "user", "content": text},
            ],
            temperature=0.3,
            max_completion_tokens=200)
        return resp.choices[0].message.content

    def process_file(self, filepath):
        filename = os.path.basename(filepath)
        print(f"Detected {filename}, summarizing…")
        with open(filepath, "r", encoding="utf-8") as f:
            text = f.read()
        summary = self.summary(text)
        out_path = os.path.join(OUT_DIR, filename.replace(".txt", "_summary.txt"))
        with open(out_path, "w", encoding="utf-8") as f:
            f.write(summary)
        print(f"Summary written to {out_path}")

    def on_created(self, event):
        if event.is_directory or not event.src_path.endswith(".txt"):
            return
        # Hand the file off to the pool so the observer thread is never blocked.
        self.executor.submit(self.process_file, event.src_path)


if __name__ == "__main__":
    os.makedirs(IN_DIR, exist_ok=True)
    os.makedirs(OUT_DIR, exist_ok=True)

    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
    executor = ThreadPoolExecutor(max_workers=4)
    event_handler = TextFileSummaryHandler(client, executor)

    observer = Observer()
    observer.schedule(event_handler, IN_DIR, recursive=False)
    observer.start()
    print(f"Watching '{IN_DIR}/' for new .txt files. Press Ctrl+C to stop.")
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        pass
    finally:
        observer.stop()
        observer.join()
        executor.shutdown()
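To sanity-check the speedup without watching directories or spending API credits, here's a self-contained sketch in which time.sleep stands in for the API call (the 0.2-second latency is made up):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_summary(name, latency=0.2):
    # Stand-in for the OpenAI API call.
    time.sleep(latency)
    return f"Summary of {name}."

files = [f"foo-{i}.txt" for i in range(1, 6)]

# Sequential: one file after another, like the single observer thread.
start = time.perf_counter()
for name in files:
    fake_summary(name)
sequential = time.perf_counter() - start

# Parallel: all five sleeps overlap in the thread pool.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as executor:
    list(executor.map(fake_summary, files))
parallel = time.perf_counter() - start

print(f"sequential: {sequential:.1f}s, parallel: {parallel:.1f}s")
```

With five files, the sequential loop needs about five times the latency, while a pool with enough workers finishes in roughly one.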
That's all I have for today! Talk to you soon ;)
Built with one.el.