Lou Plummer published a useful workflow to send emails to Obsidian using IFTTT, Dropbox, and Hazel. I liked the idea of capturing and sending information to an Obsidian Vault via email. Since I use Obsidian on all my devices, there are situations where an email would be more efficient for capturing information and storing it in Obsidian. I have had a special email account for these kinds of things for years, but the emails end up in an inbox and not in Obsidian. Additionally, solving this problem presents an enjoyable challenge.
Here are my requirements:
- Runs locally on my main computer—in my case, my MacBook
- Works with a dedicated email account
- Accesses emails via IMAP
- Downloads all unread emails and converts them into Markdown files
- Saves these files directly into the Obsidian Vault
Because I’m not yet an expert in Python programming, I asked my Obsidian Co-Pilot using OpenAI GPT-4o for help. After three iterations, I have a version that works very well. In this post, I want to present this solution, although I know it is only a quick and dirty first attempt. I will continue to refine it and make it more sophisticated.
Initial Prototype Implementation
As I mentioned earlier, I use a special email account solely for this purpose, which serves as a basic security feature. I only use this account to send emails to myself, so I haven’t received any spam on it for years. However, any account that uses IMAP can be used.
Whenever I don’t explicitly request a specific language, OpenAI typically provides a Python-based solution. Let’s go through the steps:
Preparing the Environment
The first step is to create a project folder and navigate into it:
mkdir /Users/leif/Documents/Projects/mail2obsidian
cd /Users/leif/Documents/Projects/mail2obsidian
Next, create and activate a virtual Python environment inside your project folder. This ensures that all the necessary libraries are local to this project, and all paths are handled within this environment. This way, your standard Python installation remains unchanged.
python3 -m venv env
source env/bin/activate
After activating the environment, your shell prompt will change to something like this:
(env) > $
You’ll need a few packages to run the script:
pip install imapclient markdownify pyyaml
- imapclient is a library that enables the script to interact with email servers. It’s used to fetch all unread emails from your mail server via IMAP.
- markdownify is a library that converts HTML content into Markdown format. It’s used to convert HTML emails into Markdown, though it’s not perfect.
- PyYAML is a library for parsing and writing YAML. In this case, it allows you to keep all configuration parameters outside the code in a YAML file.
Configuration
Next, we will create the configuration file config.yaml. The name matters, because the script opens exactly this file from the current working directory. You can use any text editor; I typically use the CLI editor nano. Enter your data and path as follows:
email:
  server: "imap.example.com"
  user: "your-email@example.com"
  password: "your-password"
output:
  path: "/Users/leif/Documents/ObsidianVault/98 emails"
Make sure to replace the placeholders with your actual email server details and the desired output path.
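A missing or misspelled key in the configuration only shows up as a KeyError once the script runs. As a minimal sketch, assuming the two sections shown above, a small check on the already-parsed dictionary could report such problems first. The function name validate_config is hypothetical and not part of the script.

```python
def validate_config(config):
    """Return a list of problems found in an already-parsed config dict."""
    # Sections and keys as used in the example configuration above.
    required = {
        "email": ("server", "user", "password"),
        "output": ("path",),
    }
    config = config or {}  # An empty YAML file parses to None.
    problems = []
    for section, keys in required.items():
        values = config.get(section)
        if not isinstance(values, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            if not values.get(key):
                problems.append(f"missing value: {section}.{key}")
    return problems


if __name__ == "__main__":
    # Example: the password entry was forgotten.
    example = {
        "email": {"server": "imap.example.com", "user": "your-email@example.com"},
        "output": {"path": "/tmp/vault"},
    }
    for problem in validate_config(example):
        print(problem)
```

In the script, this would be called on the result of yaml.safe_load before any of the values are used.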
The Main Code
Finally, create a new file and copy-paste the following Python script. I named mine mail2obsidian.py:
import imapclient
import email
from email.header import decode_header
import os
from markdownify import markdownify as md
import yaml

# Load configuration from YAML file
with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)

EMAIL = config['email']['user']
PASSWORD = config['email']['password']
IMAP_SERVER = config['email']['server']
OUTPUT_PATH = config['output']['path']

# Ensure the output directory exists
os.makedirs(OUTPUT_PATH, exist_ok=True)

# Connect to the server
with imapclient.IMAPClient(IMAP_SERVER) as server:
    server.login(EMAIL, PASSWORD)
    server.select_folder('INBOX')

    # Search for all unread emails
    messages = server.search(['UNSEEN'])

    for uid, message_data in server.fetch(messages, 'RFC822').items():
        email_message = email.message_from_bytes(message_data[b'RFC822'])

        # Decode email subject
        subject, encoding = decode_header(email_message['Subject'])[0]
        if isinstance(subject, bytes):
            subject = subject.decode(encoding if encoding else 'utf-8')

        # Create a safe filename
        filename = f"{subject}.md".replace('/', '_').replace('\\', '_')

        # Initialize email content
        email_content = f"# {subject}\n\n"

        # Extract email body
        for part in email_message.walk():
            if part.get_content_type() == "text/plain" or part.get_content_type() == "text/html":
                charset = part.get_content_charset() or 'utf-8'
                payload = part.get_payload(decode=True)
                if payload:
                    body = payload.decode(charset, errors='replace')
                    if part.get_content_type() == "text/html":
                        body = md(body)
                    email_content += body

        # Save the email content to a Markdown file
        file_path = os.path.join(OUTPUT_PATH, filename)
        with open(file_path, 'w', encoding='utf-8') as f:
            f.write(email_content)

        # Mark the email as read
        server.add_flags(uid, '\\Seen')

print("Unread emails have been converted to Markdown files.")
Here’s a brief explanation of what each part of the code does:
- Imports:
  - imapclient: A library for interacting with email servers using the IMAP protocol.
  - email: A module for handling email messages.
  - decode_header: A function to decode email headers.
  - os: A module for interacting with the operating system, used here to handle file paths.
  - markdownify: A library to convert HTML content to Markdown.
  - yaml: A library to parse YAML configuration files.
- Configuration Loading:
  - The script reads a configuration file named config.yaml to get the email credentials (EMAIL, PASSWORD), the IMAP server address (IMAP_SERVER), and the output directory path (OUTPUT_PATH).
- Ensure Output Directory Exists:
  - os.makedirs(OUTPUT_PATH, exist_ok=True): Ensures that the directory where the Markdown files will be saved exists. If it doesn’t exist, it will be created.
- Connect to the Email Server:
  - The script uses IMAPClient to connect to the specified IMAP server and logs in using the provided credentials.
  - It selects the 'INBOX' folder to search for emails.
- Search and Fetch Unread Emails:
  - server.search(['UNSEEN']): Searches for all unread emails in the inbox.
  - server.fetch(messages, 'RFC822'): Fetches the full email data for each unread email.
- Process Each Email:
  - For each email, it decodes the subject line using decode_header.
  - It creates a safe filename by replacing any slashes and backslashes in the subject with underscores.
  - It initializes a Markdown string with the email subject as a header.
- Extract Email Body:
  - The script iterates over each part of the email to find text or HTML content.
  - It decodes the content using the appropriate charset.
  - If the content is HTML, it converts it to Markdown using markdownify.
  - The content is appended to the Markdown string.
- Save Email as Markdown File:
  - The email content is saved to a file in the specified output directory with a .md extension.
- Mark Email as Read:
  - server.add_flags(uid, '\\Seen'): Marks the email as read on the server.
- Completion Message:
  - Prints a message indicating that the unread emails have been converted to Markdown files.
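The subject handling in the script is deliberately simple: it decodes only the first chunk of an encoded subject, fails if an email has no Subject header, and replaces only slashes in the filename. The following is a hedged sketch of a more defensive variant that uses only the standard library. The function names decode_subject and safe_filename, the "No Subject" fallback, and the 100-character limit are assumptions for illustration.

```python
import re
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode a possibly MIME-encoded Subject header into plain text."""
    if raw_subject is None:
        return "No Subject"  # Assumed fallback for emails without a subject.
    # make_header joins all decoded chunks, not just the first one.
    return str(make_header(decode_header(raw_subject)))


def safe_filename(subject, max_length=100):
    """Turn a subject into a filename that works on common filesystems."""
    # Replace path separators and characters that filesystems or
    # Obsidian links tend to handle badly.
    name = re.sub(r'[\\/:*?"<>|#^\[\]]', "_", subject)
    # Collapse runs of whitespace and trim surrounding dots and spaces.
    name = re.sub(r"\s+", " ", name).strip(" .")
    return (name[:max_length] or "untitled") + ".md"


if __name__ == "__main__":
    print(safe_filename(decode_subject("=?utf-8?b?R3LDvMOfZQ==?=")))
```

Two emails with the same subject would still overwrite each other, so adding the message UID or a timestamp to the filename is worth considering.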
If you configure everything correctly and have a few unread emails in your inbox, you can run this script (while still in the active Python environment) with:
(env) > $ python mail2obsidian.py
If everything runs smoothly, you’ll see a success message in the terminal, and all unread emails will be saved inside your Obsidian vault at the defined location. Otherwise, you might encounter some runtime errors.
Unread emails have been converted to Markdown files.
Outlook
This proposed solution is far from complete. Although my account isn’t receiving spam yet, it might be wise to ensure that not every email is automatically processed. For instance, requiring a specific string in the subject line could be a useful filter. Only emails containing this string would be downloaded.
This string could also indicate where to save the emails or serve as a tag. Different codes could direct the email to different folders or represent different tags.
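One way this could work is a short code at the start of the subject that acts as filter, folder selector, and tag at once. The sketch below assumes a made-up prefix format like "[obs:read] Actual subject". The codes, folder names, and tags in ROUTES are purely illustrative.

```python
import re

# Hypothetical routing table: subject code -> (vault subfolder, tag).
ROUTES = {
    "inbox": ("98 emails", "email"),
    "read": ("Reading List", "to-read"),
}

# Matches the assumed prefix format, e.g. "[obs:read] Actual subject".
PREFIX = re.compile(
    r"^\s*\[obs:(?P<code>[a-z0-9-]+)\]\s*(?P<title>.*)$", re.IGNORECASE
)


def route_subject(subject):
    """Return (folder, tag, title), or None if the email should be skipped."""
    match = PREFIX.match(subject)
    if match is None:
        return None  # No prefix: do not import this email.
    route = ROUTES.get(match.group("code").lower())
    if route is None:
        return None  # Unknown code: treat like a missing prefix.
    folder, tag = route
    return folder, tag, match.group("title").strip()


if __name__ == "__main__":
    print(route_subject("[obs:read] Great article"))
    print(route_subject("Buy now!"))
```

Emails for which route_subject returns None would simply be left untouched in the inbox.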
It might also be beneficial to add frontmatter for the sender’s email address or other email properties before saving the file in the vault. In my vault, every note includes a footer area, so this could be added as well.
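As a rough sketch of the frontmatter idea, the sender and date can be taken from the message headers with the standard library alone. The property names sender, sender_name, and received are assumptions, and build_frontmatter is a hypothetical helper. Values are written with json.dumps because double-quoted JSON strings are, for practical purposes, also valid YAML strings.

```python
import json
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime


def build_frontmatter(message):
    """Build a YAML frontmatter block from a few headers of an email message."""
    sender_name, sender_address = parseaddr(message.get("From", ""))
    lines = ["---"]
    # Quoting via json.dumps keeps colons or quotes in a header from
    # breaking the YAML block.
    lines.append(f"sender: {json.dumps(sender_address, ensure_ascii=False)}")
    if sender_name:
        lines.append(f"sender_name: {json.dumps(sender_name, ensure_ascii=False)}")
    date_header = message.get("Date")
    if date_header:
        try:
            received = parsedate_to_datetime(date_header)
            lines.append(f"received: {received.isoformat()}")
        except (TypeError, ValueError):
            pass  # Leave the date out if the header cannot be parsed.
    lines.append("---")
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    raw = (
        "From: Alice <alice@example.com>\n"
        "Date: Mon, 01 Jul 2024 10:00:00 +0200\n"
        "Subject: Hi\n\nBody"
    )
    print(build_frontmatter(message_from_string(raw)))
```

The returned block would go in front of the "# subject" heading when the note is written.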
Currently, only plain text and HTML are being downloaded. This is relatively safe, as it reduces the risk of executable code being loaded. However, I welcome feedback on how secure this approach is.
I believe the conversion of HTML emails could be improved, so there’s still a need to explore this further.
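One likely source of messy output is that many emails carry the same text twice, once as plain text and once as HTML. The script appends every text part it finds, so both versions end up in the note. A possible improvement, sketched here with the standard library's newer email API, is to pick a single body and prefer the plain-text version. The function name extract_body is hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_body(raw_message):
    """Return (content_type, text) for the preferred text part of a raw email."""
    # policy.default yields EmailMessage objects, which provide get_body().
    message = message_from_bytes(raw_message, policy=policy.default)
    # Prefer plain text; fall back to HTML if that is all there is.
    part = message.get_body(preferencelist=("plain", "html"))
    if part is None:
        return None, ""
    return part.get_content_type(), part.get_content()


if __name__ == "__main__":
    # Build a small multipart/alternative test message.
    demo = EmailMessage()
    demo["Subject"] = "Demo"
    demo.set_content("plain text")
    demo.add_alternative("<p>html text</p>", subtype="html")
    print(extract_body(demo.as_bytes()))
```

Only when the returned content type is text/html would the text still need to go through markdownify.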
My plan is to compile the script once it’s finished and then run it every 30 minutes using a scheduler like launchd. For this, all outputs need to be redirected to a log, and error handling should be examined more closely.
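For the scheduling part, a launchd job is described by a property list file, which Python can generate with the standard plistlib module. The following is only a sketch under several assumptions: it runs the uncompiled script with the virtual environment's Python rather than a compiled binary, and the label and log path are placeholders. The working directory is set because the script opens config.yaml by relative path.

```python
import plistlib

PROJECT = "/Users/leif/Documents/Projects/mail2obsidian"

# Label and log path are placeholders chosen for this example.
JOB = {
    "Label": "com.example.mail2obsidian",
    "ProgramArguments": [
        f"{PROJECT}/env/bin/python",
        f"{PROJECT}/mail2obsidian.py",
    ],
    # The script reads config.yaml relative to the working directory.
    "WorkingDirectory": PROJECT,
    # Run every 30 minutes (value in seconds).
    "StartInterval": 30 * 60,
    # Redirect normal and error output to a log file.
    "StandardOutPath": "/tmp/mail2obsidian.log",
    "StandardErrorPath": "/tmp/mail2obsidian.log",
}


def build_plist(job):
    """Serialize a launchd job description to XML plist bytes."""
    return plistlib.dumps(job)


if __name__ == "__main__":
    print(build_plist(JOB).decode("utf-8"))
```

The output would typically be saved as a .plist file in ~/Library/LaunchAgents and then loaded with launchctl.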
Some people have asked for an Obsidian plugin to handle this, but I’m not sure a plugin is always necessary. The great thing about Obsidian is that all notes are Markdown files, which can also be created with scripts or applications like this one. Not everything needs to be a plugin, which could potentially slow down Obsidian. Scripts like these only consume resources when they are running. I can use standard schedulers like launchd or cron to run the script periodically. It fetches emails even when Obsidian is not running, making it a perfect way to use scripts outside of Obsidian to create workflows for Obsidian.
I hope this inspires you to come up with your own ideas. It would be great if you could share them here in the comments or leave a link.