I couldn’t find much information about IMAP on the web, other than RFC 3501.
The IMAP protocol document is absolutely key to understanding the commands available, but rather than attempt to explain it, let me lead by example and point out the common gotchas I ran into.
Logging in to the inbox
import imaplib

mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('myusername@gmail.com', 'mypassword')
mail.list()  # Out: list of "folders" aka labels in gmail.
mail.select("inbox")  # connect to inbox.
Getting all mail and fetching the latest
Let’s start by searching our inbox for all mail with the search function.
Use the built in keyword “ALL” to get all results (documented in RFC3501).
We’re going to extract the data we need from the response, then fetch the mail via the ID we just received.
result, data = mail.search(None, "ALL")
ids = data[0]  # data is a list.
id_list = ids.split()  # ids is a space separated string
latest_email_id = id_list[-1]  # get the latest
result, data = mail.fetch(latest_email_id, "(RFC822)")  # fetch the email body (RFC822) for the given ID
raw_email = data[0][1]  # here's the body, which is raw text of the whole email
# including headers and alternate payloads
Using UIDs instead of volatile sequential ids
The imap search function returns a sequential id, meaning id 5 is the 5th email in your inbox.
That means if a user deletes email 10, all emails above email 10 are now pointing to the wrong email.
This is unacceptable.
Luckily we can ask the imap server to return a UID (unique id) instead.
The way this works is pretty simple: use the uid function, and pass the name of the command as the first argument. The rest behaves exactly the same.
result, data = mail.uid('search', None, "ALL")  # search and return uids instead
latest_email_uid = data[0].split()[-1]
result, data = mail.uid('fetch', latest_email_uid, '(RFC822)')
raw_email = data[0][1]
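One gotcha with the snippet above is that indexing with [-1] raises an IndexError when the mailbox is empty or the search fails. The following is a minimal sketch of a guard, assuming Python 3 (where imaplib returns bytes) and an already logged-in connection with a mailbox selected. The helper name latest_uid is made up for illustration.

```python
def latest_uid(mail):
    """Return the last UID the server lists for the selected mailbox, or None.

    `mail` is assumed to be a logged-in imaplib.IMAP4_SSL object on which
    select() has already been called.
    """
    result, data = mail.uid("search", None, "ALL")
    # A failed search or an empty mailbox leaves nothing to split.
    if result != "OK" or not data or not data[0]:
        return None
    # data[0] is a space separated bytes string of UIDs, e.g. b"3 8 21".
    return data[0].split()[-1]
```

The returned value is bytes and can be passed straight back into mail.uid('fetch', ...).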
Parsing Raw Emails
Emails pretty much look like gibberish. Luckily we have a python library for dealing with emails called… email.
It can convert raw emails into a Message object that gives you access to the headers and payloads.
import email
import email.utils

# on Python 3, raw_email is bytes: use email.message_from_bytes(raw_email) instead
email_message = email.message_from_string(raw_email)
print(email_message['To'])
print(email.utils.parseaddr(email_message['From']))  # for parsing "Yuji Tomita" <yuji@grovemade.com>
print(email_message.items())  # print all headers

# note that if you want to get text content (body) and the email contains
# multiple payloads (plaintext/ html), you must parse each message separately.
# use something like the following: (taken from a stackoverflow post)
def get_first_text_block(email_message_instance):
    maintype = email_message_instance.get_content_maintype()
    if maintype == 'multipart':
        for part in email_message_instance.get_payload():
            if part.get_content_maintype() == 'text':
                return part.get_payload()
    elif maintype == 'text':
        return email_message_instance.get_payload()
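On Python 3 the fetched message arrives as bytes, so the parsing step looks slightly different. Here is a rough sketch under that assumption. The function name first_text_block and the hand-written sample message are illustrative only, and the sample stands in for data[0][1] from a real fetch.

```python
import email


def first_text_block(raw_email):
    """Return the first text/plain body of a raw message given as bytes."""
    message = email.message_from_bytes(raw_email)
    # walk() visits the message itself and every nested part.
    for part in message.walk():
        if part.get_content_type() == "text/plain":
            charset = part.get_content_charset() or "utf-8"
            # decode=True undoes base64 / quoted-printable transfer encoding.
            return part.get_payload(decode=True).decode(charset, errors="replace")
    return None


# A tiny hand-written message standing in for a fetched email.
sample = (
    b"From: Yuji Tomita <yuji@grovemade.com>\r\n"
    b"To: someone@example.com\r\n"
    b"Subject: Hello\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hello world\r\n"
)
body = first_text_block(sample)
```

Unlike the version above, this looks specifically for text/plain, so an HTML part that comes first in a multipart message is skipped.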
Advanced searches
We’ve only done the basic search for “ALL”.
Let’s try something else such as a combination of searches we want and don’t want.
All available search parameters are listed in the IMAP protocol documentation and you will definitely want to check out the SEARCH Command reference.
Here are just a few searches to get you started.
Search any header
For searching any header, such as Subject, Reply-To, Received, etc., the command is simply “(HEADER <field-name> "<search string>")”
mail.uid('search', None, '(HEADER Subject "My Search Term")')
mail.uid('search', None, '(HEADER Received "localhost")')
Search for emails sent in the past day
Oftentimes the inbox is too large, and IMAP doesn’t specify a way of limiting the number of results, which makes searches extremely slow. One way to limit them is to use the SENTSINCE keyword.
The SENTSINCE date format is DD-Mon-YYYY, for example 01-Jun-2012. In Python, that would be strftime('%d-%b-%Y').
import datetime

date = (datetime.date.today() - datetime.timedelta(1)).strftime("%d-%b-%Y")
result, data = mail.uid('search', None, '(SENTSINCE {date})'.format(date=date))
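One thing to watch: %b follows the program's locale, so if a non-English locale has been set, strftime can produce a month abbreviation the server will not accept. A small sketch that avoids the locale entirely is below. The helper name imap_date is an assumption for illustration, not part of imaplib.

```python
import datetime

# RFC 3501 month abbreviations are always English, regardless of locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day):
    """Format a datetime.date as DD-Mon-YYYY for IMAP SEARCH date keys."""
    return "{:02d}-{}-{:04d}".format(day.day, MONTHS[day.month - 1], day.year)


yesterday = imap_date(datetime.date.today() - datetime.timedelta(1))
criteria = "(SENTSINCE {})".format(yesterday)
```

The resulting criteria string can be passed to mail.uid('search', None, criteria) exactly as before.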
Limit by date, search for a subject, and exclude a sender
date = (datetime.date.today() - datetime.timedelta(1)).strftime("%d-%b-%Y")
result, data = mail.uid('search', None, '(SENTSINCE {date} HEADER Subject "My Subject" NOT FROM "yuji@grovemade.com")'.format(date=date))
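Once several keys are combined, building the string by hand gets error prone. The sketch below is one possible way to assemble it. The function names and keyword arguments are invented for illustration, and UNSEEN (the RFC 3501 key for messages without the \Seen flag) is included because unread mail is a common thing to search for. It assumes ASCII search terms.

```python
def quote(value):
    """Wrap a search term in an IMAP quoted string, escaping \\ and "."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_criteria(since=None, subject=None, not_from=None, unseen=False):
    """Combine a few common SEARCH keys into one parenthesized string.

    `since` is expected to be a DD-Mon-YYYY date string.
    """
    parts = []
    if unseen:
        parts.append("UNSEEN")
    if since:
        parts.append("SENTSINCE " + since)
    if subject:
        parts.append("HEADER Subject " + quote(subject))
    if not_from:
        parts.append("NOT FROM " + quote(not_from))
    # With no filters at all, fall back to matching everything.
    return "(" + " ".join(parts or ["ALL"]) + ")"
```

For example, mail.uid('search', None, build_criteria(unseen=True)) would ask the server for the UIDs of unread messages only.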
Fetches
Get Gmail thread ID
Fetches can include the entire email body, or any combination of results such as email flags (seen/unseen) or gmail specific IDs such as thread ids.
result, data = mail.uid('fetch', uid, '(X-GM-THRID X-GM-MSGID)')
Get a header key only
result, data = mail.uid('fetch', uid, '(BODY[HEADER.FIELDS (DATE SUBJECT)])')
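Note that a plain BODY[...] fetch marks the message as read. RFC 3501 defines BODY.PEEK[...] as the variant that leaves the \Seen flag alone. Here is a hedged sketch of a header-only fetch that keeps mail unread, assuming Python 3 and a connection with a mailbox already selected. The name fetch_headers and its default field list are made up for this example.

```python
import email


def fetch_headers(mail, uid, fields=("DATE", "SUBJECT", "FROM")):
    """Fetch selected headers for one UID without setting the \\Seen flag.

    Returns a parsed message object holding only the requested headers,
    or None if the server did not return them.
    """
    query = "(BODY.PEEK[HEADER.FIELDS ({})])".format(" ".join(fields))
    result, data = mail.uid("fetch", uid, query)
    # A successful fetch yields a (description, payload) tuple first.
    if result != "OK" or not data or not isinstance(data[0], tuple):
        return None
    return email.message_from_bytes(data[0][1])
```

The result behaves like any parsed email, so headers["Subject"] and headers["From"] work as in the parsing section.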
Fetch multiple
You can fetch multiple emails at once. I found through experimentation that it’s expecting comma delimited input.
result, data = mail.uid('fetch', '1938,2398,2487', '(X-GM-THRID X-GM-MSGID)')
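Since search results come back as a space separated string and fetch wants commas, a tiny converter is handy. This is only a sketch, and uid_set is a hypothetical helper name. It accepts bytes (as imaplib returns them on Python 3), strings, or integers.

```python
def uid_set(uids):
    """Join UIDs into the comma delimited form that fetch accepts."""
    return ",".join(
        uid.decode("ascii") if isinstance(uid, bytes) else str(uid)
        for uid in uids
    )
```

For example, uid_set(data[0].split()) turns the result of a UID search into a single argument for mail.uid('fetch', ...).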
Use a regex to parse fetch results
The returned result isn’t very easy to digest: it is a string of space-separated key-value pairs.
Use a simple regex to get the data you need.
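import re

result, data = mail.uid('fetch', uid, '(X-GM-THRID X-GM-MSGID)')
# group names must be valid Python identifiers, so no hyphens
# on Python 3, data[0] is bytes: decode it first or use a bytes pattern
re.search(r'X-GM-THRID (?P<thrid>\d+) X-GM-MSGID (?P<msgid>\d+)', data[0]).groupdict()
# this becomes an organizational lifesaver once you have many results returned.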
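When several messages are fetched at once, each response item needs the same treatment, and the keys are not guaranteed to arrive in a fixed order. The following is a sketch for Python 3, where imaplib hands back bytes. The function name parse_gmail_ids is invented here, and the sample response format in the comment is an assumption about what Gmail sends rather than something taken from a specification.

```python
import re

# Bytes pattern, because imaplib on Python 3 returns bytes.
FIELD_RE = re.compile(rb"\b(X-GM-THRID|X-GM-MSGID|UID) (\d+)")


def parse_gmail_ids(data):
    """Turn fetch response items into a list of dicts, one per message.

    Each item is assumed to look roughly like
    b"1 (X-GM-THRID 123 X-GM-MSGID 456 UID 4)".
    """
    rows = []
    for item in data:
        # Items are plain bytes here, but tuples when a literal is attached.
        line = item[0] if isinstance(item, tuple) else item
        if not isinstance(line, bytes):
            continue
        fields = {key.decode("ascii"): int(value)
                  for key, value in FIELD_RE.findall(line)}
        # Skip fragments such as a lone b")" that carry no ids.
        if "X-GM-THRID" in fields and "X-GM-MSGID" in fields:
            rows.append(fields)
    return rows
```

Matching each key separately keeps the parsing independent of the order in which the server lists them.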
Conclusion
Well, that should leave you with a much better understanding of the IMAP protocol and using python to interface with Gmail.
Certainly more than I knew!
Thanks Yuji
This is soooo helpful!
Hi, is there a way to also use time with date in search?
All references to “time” in RFC 3501 say “disregarding time and timezone” – so I’d say no.
Very helpful; thanks lots.
short, concise and very helpful, many thanks.
Is there a way in gmail to know under how many labels (or directories) an email is ?
I would like to know with which labels my emails have been tagged.
Thanks. Very useful
Perfect – Many Thanks! Why can’t everyone write examples as straight forward as this? 🙂
Thanks for the detailed examples. Do you happen to have some code that wraps imaplib in a Django app and parses fetched results as Django EmailMessage objects? I’m just looking for a starting point, not a polished app. Thanks again!
Now that’s what I call useful stuff! 😉
PS: it would have been nice(r) to have quick example on how to fetch “unread” emails.
for hotmail live and outlook user. if u want to use IMAP4. Use outlook python library, download here : https://github.com/awangga/outlook to retrieve unread email from your inbox :
import outlook
mail = outlook.Outlook()
mail.login('emailaccount@live.com','yourpassword')
mail.inbox()
print mail.unread()
to retrieve email element :
print mail.mailbody()
print mail.mailsubject()
print mail.mailfrom()
print mail.mailto()
I am trying to use your examples in web2py DAL. Could you give more info on licensing/credits?
I also would appreciate any advice on unicode and cross service syntax issues (since it appears there is no common implementation of commands on different brands and services).
Hey Alan,
Everything I post should be Beerware : )
What’s this about cross service syntax issues? Across different IMAP services?
The IMAP commands detailed in RFC3501 should be compatible across any service that implements IMAP.
I sent a prototype to web2py issues for adding an IMAP adapter.
http://code.google.com/p/web2py/issues/detail?id=610
I was not aware of the comma separated sequence specification in the IMAP RFC and I thought it was a locally implemented Gmail syntax, so I was concerned about using the interface with different servers.
Thanks for the feedback
Regards
Ah ha, you’re right! I only see reference to a sequence-set in the format XX:YY. I’m not sure how widely supported / unsupported the comma syntax is. I’d love to hear about it if you find out!
It depends on the acceptance of the enhancement request and the tests of the users with different server brands, but IMAP RFC does specify the syntax, as you mentioned before:
http://tools.ietf.org/html/rfc3501#section-9
sequence-set = (seq-number / seq-range) *("," sequence-set)
; set of seq-number values, regardless of order.
; Servers MAY coalesce overlaps and/or execute the
; sequence in any order.
; Example: a message sequence number set of
; 2,4:7,9,12:* for a mailbox with 15 messages is
; equivalent to 2,4,5,6,7,9,12,13,14,15
; Example: a message sequence number set of *:4,5:7
; for a mailbox with 10 messages is equivalent to
; 10,9,8,7,6,5,4,5,6,7 and MAY be reordered and
; overlap coalesced to be 4,5,6,7,8,9,10.
Thanks again
How can I make it use a while loop to keep showing new emails that come after the latest but do not print the same email repetitively?
wow, awesome, ive been trying to figure this stuff out for days!!! (complete newbie)… thanks again Yuji, beautiful work!… just out of curiosity, do you know of a reference spot for info on python interacting with google voice (for sms reasons)? thanks!
Hi Yuji,
thanks for this article, i have some doubts i would be thankful if you could help me
IMAP Search Keys are as follows:
BEFORE
Messages whose internal date (disregarding time and timezone)
is earlier than the specified date.
ON
Messages whose internal date (disregarding time and timezone)
is within the specified date.
SENTBEFORE
Messages whose [RFC-2822] Date: header (disregarding time and
timezone) is earlier than the specified date.
SENTON
Messages whose [RFC-2822] Date: header (disregarding time and
timezone) is within the specified date.
SENTSINCE
Messages whose [RFC-2822] Date: header (disregarding time and
timezone) is within or later than the specified date.
SINCE
Messages whose internal date (disregarding time and timezone)
is within or later than the specified date.
in the above they are saying about “internal date” what it is?
becoz i did not find any header in the original mail with this name
is internal date different from Date: header?
can you say if i you SENTON which header does it use?
Received: by 10.112.63.19 with SMTP id c19csp82292lbs;
Tue, 21 Feb 2012 22:30:44 -0800 (PST)
Date: Wed, 22 Feb 2012 12:00:42 +0530
date = '"22 Feb 2012"'
when i search for the above like this mail.search(None, 'SENTON', date)
it does gives empty result. do you have any idea?
hi..
when i’m executing the first few lines i.e.
******************************************
import imaplib
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('myusername@gmail.com', 'mypassword')
mail.list()
# Out: list of "folders" aka labels in gmail.
mail.select("inbox") # connect to inbox.
**************************************************
following error is coming..
++++++++++++++++++++++++++++++++++++++
python sample2.py
Traceback (most recent call last):
File “sample2.py”, line 2, in
mail = imaplib.IMAP4_SSL(‘imap.gmail.com’)
File “/usr/lib/python2.6/imaplib.py”, line 1138, in __init__
IMAP4.__init__(self, host, port)
File “/usr/lib/python2.6/imaplib.py”, line 163, in __init__
self.open(host, port)
File “/usr/lib/python2.6/imaplib.py”, line 1149, in open
self.sock = socket.create_connection((host, port))
File “/usr/lib/python2.6/socket.py”, line 547, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno -2] Name or service not known
++++++++++++++++++++++++++++++++
please please reply and tell me what the problem is.. thanks in advance…
Sounds like your system can’t resolve domains.. Can you even `ping google.com` from a terminal?
thnxxxx fr da code..it was helpful..!
bt if m nt wrong, dis only extracts da latest email..is der any way 2 extract all unread msgs..??
der wer sum network-related issues….bt da codez workin nw..!
Thanks . useful code
GET A HEADER KEY ONLY has extra Square Bracket in It ..
Awesome work! really helpful..thanks yunji 🙂
I have made a post about using imaplib with oauth2..if anyone is interested its available in http://rakeshmukundan.in/2013/01/23/access-gmail-using-imaplib-and-python-with-oauth2/
This is seriously cool !!
Helped me move away the tedious java coding involved.
I’m new to python.when I copied and run your code ( changing user,password), I got a restart in the python shell but not results. I was able to ping gmail so i don’t think it’s a network issue. Please advise.
Thanks
You’ll have to read the error and follow through on it. Typical programming debugging applies… Python tends to spit a traceback.
Open the Debug control on the Python shell. I got the following message;
‘dbd’.run(),line392:exec(cmd,globals,locals)
‘_main_’.(),line 1:Import impalib
> ‘imaplib’.(),line11:””
Under Globals Section
_builtins_
_doc_ None
_ Name_ ‘_main_’
_Package_None
My Comment: What is it missing? If something is missing how can a load into python.
Thank you so much for your assistance.
Thank you for this excellent article. It was exactly what I was looking for. Nicely done.
Same here! Tried this out in the REPL and it went off without a hitch!
Excellent little tutorial. Thank you!
Thank you for this excellent tutorial.
__
Hector
I stumbled on this blog-post and it was helpful to me, but I kept running into problems where data coming from imap.gmail.com was in binary (Python vers: 3.2). I changed the following lines and everything else worked:
id_list = ids.split()
changed to:
id_list = str(ids, encoding="utf-8").split()
And
email_message = email.message_from_string(raw_email)
changed to:
email_message = email.message_from_bytes(raw_email)
I have read that email.message_from_bytes() is new in version 3.2, so this might be helpful to other Python 3.2+ users.
Do you have any advice on how to archive an email using imaplib?
I have been searching for something like this for the past 6 months.
It just eased up a project of mine so much that I can’t thank you enough.
Thank you so much! This is extremely helpful! It has also considerably eased my work.
When i tried to connect ‘outlook’ with this imaplib module its giving error like this:
Traceback (most recent call last):
File “”, line 1, in
mail = imaplib.IMAP4_SSL(‘owa.efi.com’)
File “C:\Python26\lib\imaplib.py”, line 1137, in __init__
IMAP4.__init__(self, host, port)
File “C:\Python26\lib\imaplib.py”, line 163, in __init__
self.open(host, port)
File “C:\Python26\lib\imaplib.py”, line 1149, in open
self.sock.connect((host, port))
File “”, line 1, in connect
error: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Can any 1 help me with this?
hello deepak, I have the same problem, do you resolve it?
Help me please! i have this problem:
Traceback (most recent call last):
File “email1.py”, line 21, in
from email import email
File “/home/user/Рабочий стол/Py/email.py”, line 38, in
msg = email.message_from_string(raw_email)
AttributeError: ‘module’ object has no attribute ‘message_from_string’
********************************************************************************
dir(email):
[‘Charset’, ‘Encoders’, ‘Errors’, ‘FeedParser’, ‘Generator’, ‘Header’, ‘Iterators’, ‘LazyImporter’, ‘MIMEAudio’, ‘MIMEBase’, ‘MIMEImage’, ‘MIMEMessage’, ‘MIMEMultipart’, ‘MIMENonMultipart’, ‘MIMEText’, ‘Message’, ‘Parser’, ‘Utils’, ‘_LOWERNAMES’, ‘_MIMENAMES’, ‘__all__’, ‘__builtins__’, ‘__doc__’, ‘__file__’, ‘__name__’, ‘__package__’, ‘__path__’, ‘__version__’, ‘_name’, ‘_parseaddr’, ‘base64MIME’, ‘base64mime’, ‘charset’, ’email’, ‘encoders’, ‘errors’, ‘feedparser’, ‘generator’, ‘header’, ‘importer’, ‘iterators’, ‘message’, ‘message_from_file’, ‘message_from_string’, ‘mime’, ‘parser’, ‘quopriMIME’, ‘quoprimime’, ‘sys’, ‘utils’]
from email import email sounds wrong… do you have a module on your site ALSO called email?
i delete (mymodulename).pyc and all work
try msg = email.message_from_bytes(raw_email) instead
Hi can you please let me know this issue is resolved for you??? If so can let me know what is the solution
I’m facing the same issue.
excellent post mate, made me understand parsing and fetching heaps better, keep up the good work
but the line email_message = email.message_from_string(raw_email) should be email_message = email.message_from_bytes(raw_email) or at least for me this fixed it
Hi there, Can you provide code block to extract the first word inside an email with specific subject. For example:
The Mail(this is a subject)
Hello world!!(this is content of the email)
How do i extract Hello meaning how do I extract the first word of the content.
Thank You
Hi Bhishan,
I have a similar request. Were you able to resolve this. Can you post your code here.
Thanks
Kris
Thanks for really nice post. ‘self’ in get_first_text_block causes error 🙂
Very nice. I just spend several hours to figure this stuff out, then found your post, which would’ve saved me most of the work 🙂
I’m stuck on the second part of what I need to do, which is to write emails (which I read from another IMAP server) into gmail. Perhaps you already figured this out? I’m running into
“error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry”
and haven’t figured out to pass SSL the flags that tell it to stop worrying about such details.
This post was very helpful to me. I like to thank you for sharing your knowledge.
This post helped me so much. Thanks for writing this article!
Cannot connect to gmail using SMTP or IMAP
same error occurs.
Traceback (most recent call last):
File “C:/Python33/imap_gmail.py”, line 4, in
mail=imaplib.IMAP4_SSL(‘imap.gmail.com’)
File “C:\Python33\lib\imaplib.py”, line 1214, in __init__
IMAP4.__init__(self, host, port)
File “C:\Python33\lib\imaplib.py”, line 181, in __init__
self.open(host, port)
File “C:\Python33\lib\imaplib.py”, line 1229, in open
IMAP4.open(self, host, port)
File “C:\Python33\lib\imaplib.py”, line 257, in open
self.sock = self._create_socket()
File “C:\Python33\lib\imaplib.py”, line 1217, in _create_socket
sock = IMAP4._create_socket(self)
File “C:\Python33\lib\imaplib.py”, line 247, in _create_socket
return socket.create_connection((self.host, self.port))
File “C:\Python33\lib\socket.py”, line 435, in create_connection
raise err
File “C:\Python33\lib\socket.py”, line 426, in create_connection
sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it.
Is it because I am on a proxy server..??
Proxy Servers are only used for http/https – maybe your firewall is blocking the used ports…
Can you maybe complete the tutorial with the mail.uid('thread', 16998 command for a regular IMAP server? From imaplib I can do print mail.thread('references', "utf-8", '(uid 16998)') and that works great, but I get a seq number and not a uid.
Do you maybe know how to use mail.uid('thread', 16998, ('what to put in here') ?
Thanks so much for this great post! =D
Thank for this explanation – solved my problem!
How can I click on a certain link inside the email? I really do not get it how. Thank you a lot!
You can’t click a link because you are not using a browser. There is no concept a click. You could parse the results with an HTML parser and find links, then do something with them instead!
Hi Yuji,
Thanks for the excellent example, but I ran into an issue.
I used
mail.uid(‘search’, None, ‘(SUBJECT “Comment posted on”)’)
to retrieve all uids filtered by a subject string, but unfortunately it did not respond with correct emails. Surprisingly
mail.search(None, ‘(sUBJECT “Comment posted on”)’)
did it correctly. What am I doing wrong here? Please advice.
I am getting this error :
Traceback (most recent call last):
File “C:\Users\Pawan\Desktop\TCL\samp.py”, line 3, in
mail.login(‘dhruvhbti@gmail.com’, ‘k@mesh123’)
File “C:\Python34\lib\imaplib.py”, line 538, in login
raise self.error(dat[-1])
imaplib.error: b'[ALERT] Please log in via your web browser: https://support.google.com/mail/accounts/answer/78754 (Failure)’
while execute this script:
import imaplib
mail = imaplib.IMAP4_SSL(‘imap.gmail.com’)
mail.login(‘username@gmail.com’, ‘pwd’)
mail.list()
# Out: list of “folders” aka labels in gmail.
mail.select(“inbox”) # connect to inbox.
Is there a way to delete emails from the inbox using imaplib and much of the syntax you used in the above post?
Excellent post! Very useful, thanks
Hello,
Thanks for a great post.
Can you help me out with one little thing here ?
I am using the below program to login to my Outlook 365 server using IMAP.
import imaplib
mail = imaplib.IMAP4_SSL(‘mail.o365.mailserver.com’)
print mail.login(‘myuserid@domain.com’, ‘MyPassword’)
print(‘Logged in’)
I receive the below error when doing so.
File “C:\Python27\lib\imaplib.py”, line 520, in login
raise self.error(dat[-1]) error: LOGIN failed.
I can login to outlook webmail using the same UserID & Password. But via this program, I receive the abovesaid error.
Please let me know what you think.
Thanks & regards,
Sree
very very cool! thanks!
Great instruction!
I want to forward all unread mail from a mailbox to an external address.
This is the code i created, but its only sending 1 email selected by msgid. What do i need to change to forward (and change subject off) all the messages.
import smtplib, imaplib, email
imap_host = “imap host”
client = imaplib.IMAP4(‘imap_host’)
client.login(user, pwd)
client.select(‘INBOX’)
msgid = 2
status, data = client.fetch(msgid, “(RFC822)”)
email_data = data[0][1]
message = email.message_from_string(email_data)
message.replace_header(“Subject”, “test”)
message.replace_header(“From”, ‘from_address’)
message.replace_header(“To”, ‘to_address’)
smtp = smtplib.SMTP(‘smtp_host’)
smtp.starttls()
smtp.login(user, pwd)
from_addr = “from_address”
to_addr = “to_address”
smtp.sendmail(from_addr, to_addr, message.as_string())
Im already gladfull for all the time you take to help me!
we want to forward all the unseen mails from purchaseorder@company.com to a suplier after changing the subject. Then you know what we want to reach with this script.
hello sir Can you please help me I want to extract the only message from email how can I do that
i want to fetch the mails through uid i mean i should be able to pass the uid as argument in the function and then it should fetch the mail??
Thanks for your great Work. The below code is working fine while running from cmd but facing issue while running from visual code(Anacode 3)
import imaplib
import email
mail = imaplib.IMAP4_SSL(‘imap.gmail.com’)
mail.login(‘karthick.k2@wabco-auto.com’, ‘W@bco123$’)
mail.list()
mail.select(‘Inbox’)
result, data = mail.uid(‘search’, None, “ALL”)
i = len(data[0].split())
for x in range(i):
latest_email_uid = data[0].split()[x]
#id_list = str(ids, encoding=”utf-8″).split()
result, email_data = mail.uid(‘fetch’, latest_email_uid, ‘(RFC822)’)
raw_email = email_data[0][1]
raw_email_string = raw_email.decode(‘utf-8’)
email_message = email.message_from_bytes(raw_email_string)
for part in email_message.walk():
if part.get_content_type() == “text/plain”:
body = part.get_payload(decode=True)
save_string = str(“D:Dumpgmailemail_” + str(x) + “.eml”)
myfile = open(save_string, ‘a’)
myfile.write(body.decode(‘utf-8’))
myfile.close()
else:
continue
I’m facing issue in “email_message = email.message_from_bytes(raw_email_string)”
Can you please help me in resolving this.
Can you also please let us know how to download the pdf attachment from specific mail?
I face difficulties while fetching HTML data in message body. While i store it in local database, it save an unstructured HTML format, Which causes an unstructured view on template. Will anyone help me out to fix this issue?
Can someone please let me know how to search gmail messages that are grouped. So, let’s say I have 1000 emails and are grouped to send 100 email messages in one email, so we will see 10 emails for 1000 email messages. How to retrieve the grouped emails like this ?
Every time I receive mail in this way, an unread letter becomes read. But I would like it to remain unread until I go to webmail. Tell me how to read the headers of mail from whom, subject, etc. without opening the letter itself?