Thamtech, Software and Web Development

Gmail to Google Apps Email Migration

I came up with a method for migrating the emails in my personal Gmail (user@gmail.com) email account to my Google Apps (user@thamtech.com) email account. I had a few simple requirements:

  • Every email in the @gmail.com account must be migrated into the @thamtech.com account with all attachments intact.
  • The read/unread status of each email must be maintained.
  • The labels applied to each email must be maintained, whether they were applied by a filter or manually.
    • Certain Google-endorsed migration solutions are only able to maintain message labels that were applied automatically by a filter.
  • The starred/non-starred status of each email must be maintained.
  • The date on migrated emails must be the original date, NOT the date of migration.
    • Certain migrations involving Entourage have had this unfortunate result.
  • The Recipient column when viewing the list of migrated Sent Mail must show the recipients of the emails, NOT my name or "me".

Also, Gmail normally replaces my name with "me" when displaying the sender/receiver of emails. I wanted the migrated emails to display the same way, as "me," rather than as "user@gmail.com". Is this too much to ask? No!

I found a solution using imapsync and Amazon EC2 (I suppose any old computer would do, but this gave me a much higher bandwidth connection to Google's servers than I would have had otherwise). Here's a brief overview of my procedure:

  1. Run an Amazon EC2 instance of "Fedora Core 4: Developer," instance ami-26b6534f
  2. SSH into my new instance
  3. Install imapsync and required Perl packages
  4. Build a script called "run-imapsync":
    imapsync --host1 imap.gmail.com \
    --port1 993 --user1 user@gmail.com \
    --passfile1 ./passfile1 --ssl1 \
    --host2 imap.gmail.com \
    --port2 993 --user2 user@domain.com \
    --passfile2 ./passfile2 --ssl2 \
    --syncinternaldates --split1 100 --split2 100 \
    --authmech1 LOGIN --authmech2 LOGIN \
    --regexmess 's/Delivered-To: user@gmail.com/Delivered-To: user@domain.com/g' \
    --regexmess 's/<user@gmail.com>/<user@domain.com>/g' \
    --regexmess 's/Subject:(\s*)\n/Subject: (no--subject)$1\n/g' \
    --regexmess 's/Subject: ([Rr][Ee]):(\s*)\n/Subject: $1: (no--subject)$2\n/g'
    where "user@gmail.com" is your Gmail account and "user@domain.com" is your Google Apps account.
  5. Make the script executable with
    chmod 744 run-imapsync
  6. Create the password files named "passfile1" and "passfile2" that contain the password for the source and destination imap accounts, respectively.
  7. Execute the script
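
For step 6, one way to create the password files is sketched below. The helper name write_passfile is made up for this example, and it assumes imapsync reads the password from the first line of the file. It reads the password from standard input so that it does not end up on the command line, and it keeps the file readable only by its owner.

```shell
#!/bin/sh
# Sketch only: write_passfile is a hypothetical helper, not part of imapsync.
# It reads one line (the password) from standard input and stores it in the
# named file with owner-only permissions.
write_passfile() {
    file=$1
    # Accept a final line even if it lacks a trailing newline.
    IFS= read -r password || [ -n "$password" ] || return 1
    # umask 077 makes a newly created file mode 600.
    ( umask 077 && printf '%s\n' "$password" > "$file" ) || return 1
    # Also tighten the mode in case the file already existed.
    chmod 600 "$file"
    unset password
}

# Example (type the password, then press Enter):
#   write_passfile passfile1
#   write_passfile passfile2
```

Run it once per account from the directory that holds run-imapsync, since the script refers to ./passfile1 and ./passfile2.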

imapsync command

My imapsync command calls for a little explanation.

The --regexmess parameters are regular expressions to apply to each message before it is uploaded to the destination server. The first two change the header email addresses from my old address to the new email address. This makes Google label them as "me" instead of "Tyler" in the web interface of my destination account.
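
Because these expressions are evaluated by Perl, an unescaped "@gmail" can be read as an array to interpolate, and some Perl versions warn about it. A minimal sketch of building the same two expressions with each "@" written as "\@" follows. The helper name escape_at and the variable names are made up for this example.

```shell
#!/bin/sh
# Sketch only: escape_at is a hypothetical helper that writes every "@" as
# "\@" so that Perl treats it as a literal character.
escape_at() {
    printf '%s\n' "$1" | sed 's/@/\\@/g'
}

# Placeholder addresses, as in the script above.
old=$(escape_at 'user@gmail.com')
new=$(escape_at 'user@domain.com')

# The same two substitutions as in run-imapsync, with the addresses escaped.
regex_delivered="s/Delivered-To: $old/Delivered-To: $new/g"
regex_brackets="s/<$old>/<$new>/g"

# They would be passed to imapsync as:
#   --regexmess "$regex_delivered" --regexmess "$regex_brackets"
```

Printing the two variables with echo before running the real command is a cheap way to check that the backslashes survived the shell quoting.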

I was getting errors from the script when it tried to upload messages that had no subject (it also had errors uploading emails with subject "Re: ", where there was no real subject other than the prefix). To fix this, I added the next two regular expressions to replace blank subjects with "(no–subject)". It STILL had problems, so I tried "(no--subject)" and it worked. It seems strange, but it worked and I didn't investigate further.

I don't think I had any emails with subject "Fw: " or "FWD: ". If you do, and you are getting errors when the script tries to upload them, try adding a couple more regular expressions to the command to fix "Fw: " subject lines like it does for "Re: ".
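
As a starting point, such an expression might look like the sketch below. It is modeled on the "Re:" expression and has not been tried against Gmail. The variable name FW_REGEXMESS is made up for this example.

```shell
#!/bin/sh
# Sketch only: a variant of the "Re:" expression that matches the prefixes
# "Fw:", "FW:", "Fwd:" and "FWD:" when nothing but whitespace follows them.
FW_REGEXMESS='s/Subject: ([Ff][Ww][Dd]?):(\s*)\n/Subject: $1: (no--subject)$2\n/g'

# It would be appended to the imapsync command as one more option:
#   --regexmess "$FW_REGEXMESS"
```

The single quotes matter here: they stop the shell from expanding $1 and $2, which are meant for Perl.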

You can append an additional argument to the imapsync command, "--folder X", where X is an IMAP folder to transfer. You could use this if you only want to transfer "[Gmail]/All Mail" or "[Gmail]/Sent Mail", for example. I like to append '--folder "$1"' to my imapsync command in the run-imapsync file, and then execute run-imapsync with a single parameter, like "[Gmail]/All Mail" (including the quotes, so that the shell treats the string containing a space as one unit rather than as two separate parameters).
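
A minimal sketch of that idea, written as a shell function with an abbreviated option list, is shown below. The function name sync_folder and the IMAPSYNC_BIN override are made up for this example. The folder names assume Gmail's IMAP hierarchy separator is a forward slash.

```shell
#!/bin/sh
# Sketch only: an abbreviated run-imapsync that takes the folder as its first
# argument. IMAPSYNC_BIN is a hypothetical override so the command line can
# be previewed (IMAPSYNC_BIN=echo) without contacting any server.
sync_folder() {
    if [ $# -ne 1 ]; then
        echo "usage: sync_folder FOLDER" >&2
        return 2
    fi
    "${IMAPSYNC_BIN:-imapsync}" \
        --host1 imap.gmail.com --port1 993 --user1 user@gmail.com \
        --passfile1 ./passfile1 --ssl1 \
        --host2 imap.gmail.com --port2 993 --user2 user@domain.com \
        --passfile2 ./passfile2 --ssl2 \
        --syncinternaldates \
        --folder "$1"
}

# Example: the quotes keep the folder name, space included, as one argument.
#   sync_folder "[Gmail]/All Mail"
```

The remaining options from the full script (--split1, --authmech1, the --regexmess expressions and so on) would go in the same place, before --folder.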

Lockdown in Sector 4

I became impatient and decided I could transfer multiple labels simultaneously by issuing the run-imapsync command multiple times in parallel with different folder parameters. After a while of doing this, I got the dreaded "Lockdown in sector 4" message from Gmail. It did not lock me out of my web interfaces, but it did prevent me from transferring emails through IMAP for a few hours. Once I got back in, I limited myself to running one instance of imapsync at a time.
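
One way to work through several folders without opening parallel sessions is a plain loop, sketched below. The helper name sync_in_turn is made up for this example. It assumes run-imapsync accepts the folder as its first argument and that folder names use a forward slash.

```shell
#!/bin/sh
# Sketch only: sync_in_turn is a hypothetical helper. It runs the given
# command once per folder, strictly one after another, so only one transfer
# is in progress at any time.
sync_in_turn() {
    cmd=$1
    shift
    for folder in "$@"; do
        echo "Syncing: $folder"
        # Keep going if one folder fails; it can be rerun afterwards.
        "$cmd" "$folder" || echo "Failed: $folder (rerun it later)" >&2
    done
}

# Example:
#   sync_in_turn ./run-imapsync "INBOX" "[Gmail]/Sent Mail" "[Gmail]/All Mail"
```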

Multiple Executions

You will almost certainly have to run the imapsync command multiple times before all of your mail is transferred, unless you have just a few emails to begin with. I had to run it probably 20-50 times to get everything transferred (about 450MB). Imapsync exits every once in a while for whatever reason - maybe the IMAP servers kick it off when they get tired of it.

I did have to run it many more times than 20-50 in the course of figuring out the procedure described in this post, but 20-50 seems about what it took once I started fresh using the command described above.
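
Rerunning by hand gets tedious, so a small retry loop can help. The sketch below uses a made-up helper name, retry_until_done, and assumes that an interrupted imapsync run exits with a non-zero status, which may not hold for every version.

```shell
#!/bin/sh
# Sketch only: retry_until_done is a hypothetical helper. It reruns a command
# until it exits with status 0 or the attempt limit is reached, pausing
# between attempts.
retry_until_done() {
    max_attempts=$1
    pause_seconds=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        echo "Attempt $attempt of $max_attempts" >&2
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        # Wait before the next attempt, but not after the last one.
        if [ "$attempt" -le "$max_attempts" ]; then
            sleep "$pause_seconds"
        fi
    done
    return 1
}

# Example: up to 50 attempts, one minute apart.
#   retry_until_done 50 60 ./run-imapsync
```

The pause is a guess at being polite to the server; the right value, if any is needed, is not something I measured.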

Conclusion

Overall, the migration from my [user]@gmail.com account to [user]@thamtech.com was a complete success, meeting all of the requirements I mentioned at the beginning of this post. If I find some time, I'll work up more detailed instructions or maybe set up an Amazon EC2 ami image that's ready to go.

Comments

Hi Tyler, Finally, a way to properly transfer from Gmail to Google Apps. One question: did it maintain conversation grouping? Also (OK, 2 questions), how much did it end up costing with Amazon EC2? Prices seem more than reasonable. Thanks, William

Hi William, Good questions! It does maintain the conversation grouping. As for the cost, I spent a total of $17.66 on EC2 during the month of March working out this procedure and transferring my 450MB Gmail account to Google Apps. Naturally, this involved a LOT of trial and error, costing many more hours and data transfer bytes than if I could have just run the procedure. I estimate that the cost of transferring my account once would have been under $4. Tyler

Hey, thanks for this guide Tyler! It was very helpful! Just something I ran into though: I believe that in certain versions of Perl, you should escape the '@' signs in the regexes, otherwise you get warnings about the string @gmail being interpolated. At this point, I'm not sure if it will perform the regex properly. At least, this was happening to me with the version of Perl I was running.

Hey, were you able to create the EC2 image like you said? I'm fairly new at Linux and I have trouble installing packages. Currently my best bet is to experiment on my own little Linux box, or wait for google to release a newer syncing manager which takes care of all these issues. But I like how you had complete success with your task! :) Now if only I could figure out basic linux myself lol...

Hi neeral, I haven't had a chance to work on a public EC2 image yet. I'll try to get one up sometime this week.

sweetness, I eagerly await! :)

Does anybody have a simpler way to do this? Honestly, for those of us without much computer knowledge, this gets complicated.... Thanks!

Great script. I've tested it and all is good, just one thing: the dates of the messages appear as the date of migration, not the original date of the message. But when I click on a message and take a look at the headers, they show the real date of the message. Any clue, dude?

MOe, that is strange. The dates worked out fine for me. In fact, it was one of my requirements that the dates showed the date of the message, not the date of migration. Ashley (posted on April 11) mentioned a regex issue with certain versions of Perl. I suspect the version of Perl, imapsync, or one of the many required libraries or modules may be the cause of your date issue. Sorry to neeral and anyone else waiting for an EC2 image, but I have not had a chance to work on that yet. I will try to get to it soon.

Thanks for this - very timely for me. I have just married and want to change my last name, and my main technical barrier has been my firstname.lastname@gmail.com address. It would be terribly convenient if this works.

Not Gmail to Gmail per se, but I found this site while looking for Courier to Gmail migration. Imapsync does have a FAQ that shows some examples of the 'less documented' features: http://www.linux-france.org/prj/imapsync/FAQ But where I got stuck was Gmail's 'special' folders, e.g. my traditional Inbox.Sent -> [Gmail]/Sent Mail. In this case the author's FAQ didn't work for me, but I eventually got it to work as such:

    --folder Inbox.Sent \
    --regextrans2 's/Inbox.Sent/\[Gmail\]\/Sent Mail/'

I am pretty poor so I skipped the Amazon flexible computing step and ran this on one of my home Linux boxes, running Ubuntu Hardy Heron. I only had to:

    #apt-get install imapsync make unzip lynx
    #cpan
    cpan> install Date::Manip

Then drop back to the standard prompt and:

    ./run-imapsync

Note: I did have to escape the @'s in the regex bit by putting a backslash in front, e.g. user\@gmail.com

Oddly enough, on the first run it says I don't have anywhere near the number of messages in my inbox and archive as I do. Ah well, it is working and that's the main thing. Thanks Tyler.

I noticed that I still had my "--maxage 1" parameter in my run-imapsync script. I didn't intend for it to be there in this post, so I edited it out. Morgan, check the maxage parameter. That could cause the script to show too few messages in your inbox. Tyler

Tyler I noticed that max age bit after I grep'ed the man page. It all seems fixed now. I tried to post here but got an error with wordpress. I blogged about it too: http://morganstorey.com/2008/04/movement-at-station-so-i-have-decided.html

Wow, this saved me SO much time! It seemed to error out a lot, but since it will skip already synced messages I just reran it many times. Actually, to make it easier to know where I left off I did each label/folder individually using the --folder option. Thanks for putting this guide together. David

Hey, thanks for the instructions! I did this on a simple Linux (Ubuntu) setup as well. Just wanted to point out that for some reason it doesn't work with the latest version of imapsync (2.52). I'm now running with 2.49 and it looks good. wo.

Did you have any trouble with imapsync truncating emails? I am trying to get imapsync to run with a cron job every night and backup all of my Gmail to my local IMAP server, but it is cutting off the last line of every single email.

imapsync 1.217 saves the day! I previously used 1.252 and 1.249 with the truncating problem.

I hate being Unix illiterate... I could figure this out as I can do a few things, but this would take me a lot of time and wouldn't accomplish my goal of migrating documents as well... though I could probably just download all the documents and re-upload them. Anyone want to set this up for me on a box? :)

Hi there. I would like to transfer mail from my old, large Gmail account to a new Gmail account or similar account (Google Apps?) or to my desktop. Can you kindly help me or advise on how to do it? I am not a developer. Can I run this script myself? Thank you.

Thanks a lot for the script! I'm using it now to transfer ~600MB of mail from my gmail account to my new google apps gmail account. One thing I noticed after it crashed out the first time is that it doesn't appear to be successfully identifying duplicate messages. It just starts over at the beginning, copying all messages over. I've gone to using --folder and --maxage to get through all the email, but this is peculiar. I'm running on Ubuntu Gutsy and have tried both imapsync 1.219 and 1.255. Same result.

I ran into the problem that in Germany (and I think also in the UK), Google had to change their service from Gmail to Googlemail due to some legal crap. You can add the following command to rename the [Google Mail] folders to the required [Gmail] folder:

    --regextrans2 's/\[Google Mail\]/\[Gmail\]/'

There is a problem with Gmail when you have the same files in the trash folder. Though imapsync says it's copied, it is in fact not (or it actually is but Google doesn't create a label correctly). Basically, just make sure that your trash is empty before syncing. Here's the conversation I had recently with Gilles:

---------
Hi Gilles,

On Sat, Aug 30, 2008 at 10:08 PM, Gilles LAMIRAL
> Ok.
> I made more tests. I found this behavior:
>
> 1) First gm_1 to gm_2 is OK with inbox
>
> 2) I delete gm_2/inbox
>    messages go to gm_2/Trash
>
> 3) Second gm_1 to gm_2 is KO with inbox
>    gm_1/inbox messages are no longer stored
>    in gm_2/inbox even if imapsync did it successfully.
>
> 4) I empty gm_2/Trash
>
> 5) Third gm_1 to gm_2 is OK with inbox
>    gm_1/inbox messages are in gm_2/inbox
>

Thank you very much for this post. It was very useful.

Hi, does this script also port over the "Chats" folder? To use this, do I need to have Google Apps Premier Edition, or can I use it with the free standard account? Regards

thanks! This worked fine - I am in the process of transferring almost 2GB from my gmail a/c. Ubuntu Hardy, latest imapsync (1.241-1ubuntu1). Just had to modify your script to escape the @

Thanks to Tyler's script I was able to move lots of mail from my Gmail account to my Google Apps for Domains account. I found Morgan's comment on the various packages needed for Ubuntu helpful. Actually I would like to expand his comment a little: there is also a package for the Perl library Date::Manip. I am not a big fan of installing Perl packages via cpan because you need to compile them on every machine, you'll need to have a C compiler, they will not be kept up to date on your systems via the package manager, etc. If you just want to install all packages via apt-get the line would be:

    #apt-get install imapsync unzip lynx libdate-manip-perl

After that you can run the script straightaway, no need to set up and use cpan.

This is just what i was looking for. Extra props for Morgan and Michiel, I'm probably too computer illiterate to figure out the dependencies myself. Thanks!

Hi friends, everything in the blog is excellent, but it is for IT people like you. Is there any way to copy and store Gmail or Yahoo Mail messages, incoming and outgoing, in a safe place on my Windows XP machine? I'd prefer not to use Outlook Express. Anxiously waiting for your response. -krishna

I'd like to do this but it is a bit over my head technically. Could you, or anyone else, do this for a fee?

This is fantastic, thanks for taking the time to write a post about it. So far it is working great, although I am having one issue. Occasionally imapsync begins consuming 100% of the CPU and I must kill the process. When I restart the script it begins copying the messages from the folders it has already completed, as if they were not already in sync. As these folders are quite large, this is rather time consuming. Is this normal behavior?

Hi. I ran across this post this morning looking to do what is described here. I am a Linux sysadmin, so this solution appeals to me. I didn't, however, wind up using it. I ended up just using my wife's Vista laptop for ease/convenience. I stumbled across http://www.gmail-backup.com and gave it a shot. When I first saw it, I wasn't sure that it would be exactly what I wanted, but this post (http://www.gmail-backup.com/i-want-migrate-google-apps-gmail-account-sta...) assured me that it would meet all of the requirements listed above. Give it a shot. It was a REAL time saver for me.

I used imapsync to migrate my Gmail inbox to my Google Apps account as well and it worked great. Based on imapsync's README, I used a slightly different command line:

    imapsync \
    --host1 imap.gmail.com --ssl1 \
    --user1 \
    --password1 XXX \
    --host2 imap.gmail.com \
    --user2 --ssl2 \
    --password2 XXX \
    --useheader 'Message-Id' --skipsize

(the "useheader" and "skipsize" options appear to be necessary for Google IMAP right now) I just ran it from my home machine and it didn't take too long. The whole thing took less than a day for 25k messages / 900 megs of data.

Hi Tyler, This is an excellent post and I am sure you've put in a lot of effort in order to get this working. Hats off to you. I just went through the various comments and found that there are many people who are not really computer savvy. It would be great if we could create a very simple user interface that accepts the old Gmail account info (email and password) on one side and the new Google Apps account info on the other, with a "Migrate Now" button. We take care of all the complexities at the back end. If you want, I can be of some help to you. You can use our existing EC2 instance for doing this and one of our existing web applications running on Tomcat, so that the end user just has to type in the URL http:///migrate. During the migration process, we can also show the user dynamically how much has been migrated and how much more time it is going to take. If we do this, it is going to be a HUGE hit across the globe. Please share your thoughts. You can reach me at hasan@mpowerglobal.com

Thanks for the interesting page/script. It seems to be working, but I'm getting strange results which are hard to debug. I've got thousands of emails and nearly 2GB of data, so syncing was a long and repetitive process. What's strange, though, is that the source Gmail account reports (4121) unread messages, and the Inbox says I am viewing 1-50 of 11428 emails. The destination Gmail account, however, reports (3765) unread messages, and the view is 1-50 of 11585 emails, so even MORE than the original Gmail account. The destination account doesn't get any emails from other sources and was empty when I started... I see similar results in Sent Items etc. Weird. I'm using the Ubuntu Hardy version of imapsync.

I updated your comment! Could we take this forward?

If imapsync is downloading my messages locally and then uploading them, can I ask imapsync to also keep a local backup of the downloaded mail besides uploading it? Or do I need to modify the Perl scripts?

Currently, Google offers a free 30-day trial of the Premier edition. It comes with an IMAP migration tool which works with Gmail (Advanced > Migration from the control panel). So probably the easiest way is to go for the free trial, use the tool to migrate once (which is what most of us need), and then cancel the subscription. Imapsync didn't work for me with the above stated options, but Google's email migration tool worked fine.

When I try to run your script, it always ends up with the following error message:

    Host imap.gmail.com says it has NO CAPABILITY for AUTHENTICATE LOGIN
    Error login : [imap.gmail.com] with user [myuser@gmail.com] auth [LOGIN] 3 NO [ALERT] Invalid credentials (Failure)

I already double-checked my login data. They work on each IMAP client. I also tried "PASSWORD" as authentication method, causing the same message. Any idea? Regards, Andreas

Tyler, great work, thanks for doing all this. Are you sure imapsync speed is limited mainly by bandwidth? I have been running it on a pretty fast single-core dedicated server, no other significant tasks running, and a very fast home cable modem (4Mbps/17Mbps up/down). Based on the system profile, Imapsync appears to be CPU-bound almost 100% of the time. What do you think?

Great article. I followed the steps when I migrated my own personal mail. However, when I migrated my company's mail, I actually ended up using a service called YippieMove to do the same thing. Didn't really want to babysit IMAPSync to transfer 20 accounts. =)

Thank you very much, saved my life.

I'm lazy, so I plopped it into a script with variables for the email addresses:

    #! /bin/bash
    # -*-ksh-*-
    user1='user@gmail.com'
    user2='newuser@googleapps'
    user1e=${user1//@/\\@}
    user2e=${user2//@/\\@}
    echo imapsync --host1 imap.gmail.com \
    --port1 993 --user1 "$user1" \
    --passfile1 ./passfile1 --ssl1 \
    --host2 imap.gmail.com \
    --port2 993 --user2 "$user2" \
    --passfile2 ./passfile2 --ssl2 \
    --syncinternaldates --split1 100 --split2 100 \
    --authmech1 LOGIN --authmech2 LOGIN \
    --useheader "Message-ID" --skipsize \
    --regexmess "s/Delivered-To: $user1e/Delivered-To: $user2e/g" \
    --regexmess "s///g" \
    --regexmess 's/Subject:(\s*)\n/Subject: (no--subject)$1\n/g' \
    --regexmess 's/Subject: ([Rr][Ee]):(\s*)\n/Subject: $1: (no--subject)$2\n/g'