Gmail to Google Apps Email Migration
I came up with a method for migrating the emails in my personal Gmail (user@gmail.com) email account to my Google Apps (user@thamtech.com) email account. I had a few simple requirements:
- Every email in the @gmail.com account must be migrated into the @thamtech.com account with all attachments intact.
- The read/unread status of each email must be maintained.
- The labels applied to each email must be maintained, whether they were applied by a filter or manually. (Certain Google-endorsed migration solutions are only able to maintain message labels that were applied automatically by a filter.)
- The starred/non-starred status of each email must be maintained.
- The date on migrated emails must be the original date, NOT the date of migration. (Certain migrations involving Entourage have had this unfortunate result.)
- The Recipient column when viewing the list of migrated Sent Mail must show the recipients of the emails, NOT my name or "me". (Certain migrations involving Entourage or Outlook have had this unfortunate result.)
Also, Gmail normally replaces my name with "me" when displaying the sender/receiver of emails. I prefer that the emails display exactly the same, "me," after being migrated, rather than saying "user@gmail.com". Is this too much to ask? No!
I found a solution using imapsync and Amazon EC2 (I suppose any old computer would do, but this gave me a much higher bandwidth connection to Google's servers than I would have had otherwise). Here's a brief overview of my procedure:
- Run an Amazon EC2 instance of "Fedora Core 4: Developer" (AMI ami-26b6534f)
- SSH into my new instance
- Install imapsync and required Perl packages
- Build a script called "run-imapsync":
imapsync --host1 imap.gmail.com \
--port1 993 --user1 user@gmail.com \
--passfile1 ./passfile1 --ssl1 \
--host2 imap.gmail.com \
--port2 993 --user2 user@domain.com \
--passfile2 ./passfile2 --ssl2 \
--syncinternaldates --split1 100 --split2 100 \
--authmech1 LOGIN --authmech2 LOGIN \
--regexmess 's/Delivered-To: user@gmail.com/Delivered-To: user@domain.com/g' \
--regexmess 's/<user@gmail.com>/<user@domain.com>/g' \
--regexmess 's/Subject:(\s*)\n/Subject: (no--subject)$1\n/g' \
--regexmess 's/Subject: ([Rr][Ee]):(\s*)\n/Subject: $1: (no--subject)$2\n/g'
where "user@gmail.com" is your Gmail account and "user@domain.com" is your Google Apps account.
- Make the script executable with
chmod 744 run-imapsync
- Create the password files named "passfile1" and "passfile2" that contain the passwords for the source and destination IMAP accounts, respectively.
- Execute the script
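The password files from the steps above can be created so that only your own user can read them. This is a minimal sketch, not part of the original procedure: the helper name write_passfile is made up for illustration, and it assumes imapsync takes the password from the first line of the file.

```shell
# Hypothetical helper (not from the original post): write a password file
# that only the current user can read. The password is read from stdin.
write_passfile() {
  passfile=$1
  # umask 077 inside a subshell: a newly created file gets mode 600
  ( umask 077; cat > "$passfile" )
  # Also tighten permissions in case the file already existed
  chmod 600 "$passfile"
}
```

Usage, for example: run write_passfile ./passfile1, type the password, press Enter, and then press Ctrl-D. Repeat with ./passfile2 for the destination account.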
imapsync command
My imapsync command calls for a little explanation.
The --regexmess parameters are regular expressions to apply to each message before it is uploaded to the destination server. The first two change the header email addresses from my old address to the new email address. This makes Google label them as "me" instead of "Tyler" in the web interface of my destination account.
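Because imapsync is written in Perl, the --regexmess expressions are Perl substitutions, so they can be tried on a sample message before a real run. The sketch below is an illustration rather than part of my script: preview_address_rewrite is a made-up helper name, perl is assumed to be installed, and the addresses are the same placeholders as above. It escapes the @ signs (and the dots) in the patterns, which some Perl versions appear to need according to the comments on this post.

```shell
# Hypothetical helper: apply the two address-rewriting substitutions to a
# message read from stdin. Assumption: imapsync applies each --regexmess
# expression to the whole message, which -0777 imitates by reading all of
# stdin as one string.
preview_address_rewrite() {
  perl -0777 -pe '
    s/Delivered-To: user\@gmail\.com/Delivered-To: user\@domain.com/g;
    s/<user\@gmail\.com>/<user\@domain.com>/g;
  '
}
```

To use the escaped forms with imapsync itself, the same expressions would go into the --regexmess options, for example --regexmess 's/<user\@gmail\.com>/<user\@domain.com>/g'.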
I was getting errors from the script when it tried to upload messages that had no subject (it also had errors uploading emails with subject "Re: ", where there was no real subject other than the prefix). To fix this, I added the next two regular expressions to replace blank subjects with "(no–subject)". It STILL had problems, so I tried "(no--subject)" and it worked. It seems strange, but it worked and I didn't investigate further.
I don't think I had any emails with subject "Fw: " or "FWD: ". If you do, and you are getting errors when the script tries to upload them, try adding a couple more regular expressions to the command to fix "Fw: " subject lines like it does for "Re: ".
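One possible extra expression for forwarded messages, modeled on the "Re: " one, is sketched here. I have not tested it against Gmail: the character classes are meant to cover "Fw", "FW", "Fwd" and "FWD", and preview_fw_fix is a made-up helper name so the substitution can be checked locally with perl (assumed to be installed) before it is added to the imapsync command.

```shell
# The option that would be appended to the imapsync command:
#   --regexmess 's/Subject: ([Ff][Ww][Dd]?):(\s*)\n/Subject: $1: (no--subject)$2\n/g'

# Hypothetical helper: run the same substitution over a message on stdin.
preview_fw_fix() {
  # -0777 reads the whole message as one string so \n can match line ends
  perl -0777 -pe 's/Subject: ([Ff][Ww][Dd]?):(\s*)\n/Subject: $1: (no--subject)$2\n/g'
}
```

Subjects that have real text after the prefix are left alone, because the pattern only matches when nothing but whitespace follows the colon.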
You can append an additional argument to the imapsync command, "--folder X", where X is an IMAP folder to transfer. You could use this if you only want to transfer "[Gmail]/All Mail" or "[Gmail]/Sent Mail", for example. I like to append '--folder "$1"' to the imapsync command in the run-imapsync file, and then execute run-imapsync with a single parameter, like "[Gmail]/All Mail" (including the quotes, so that the shell treats the string containing a space as one argument rather than as two separate parameters).
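A version of run-imapsync that takes the folder as its first argument might look like the sketch below. The IMAPSYNC variable is an assumption added for this example so the command line can be printed with echo instead of being run. Folder names are written with a forward slash, which is how Gmail's IMAP folders are named.

```shell
#!/bin/sh
# Sketch of a run-imapsync variant that takes the folder as its first argument.
# IMAPSYNC is a hypothetical override: set IMAPSYNC=echo to print the command
# line instead of contacting the servers.
run_imapsync() {
  folder=$1
  # The --regexmess options from the full command are left out to keep the
  # sketch short; add them before --folder.
  ${IMAPSYNC:-imapsync} --host1 imap.gmail.com \
    --port1 993 --user1 user@gmail.com \
    --passfile1 ./passfile1 --ssl1 \
    --host2 imap.gmail.com \
    --port2 993 --user2 user@domain.com \
    --passfile2 ./passfile2 --ssl2 \
    --syncinternaldates --split1 100 --split2 100 \
    --authmech1 LOGIN --authmech2 LOGIN \
    --folder "$folder"
}

# Only run when a folder name was given on the command line
if [ "$#" -gt 0 ]; then
  run_imapsync "$1"
fi
```

Usage, for example: ./run-imapsync "[Gmail]/All Mail"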
Lockdown in Sector 4
I became impatient and decided I could transfer multiple labels simultaneously by issuing the run-imapsync command multiple times in parallel with different folder parameters. After a while of doing this, I got the dreaded "Lockdown in sector 4" message from Gmail. It did not lock me out of my web interfaces, but it did prevent me from transferring emails through IMAP for a few hours. Once I got back in, I limited myself to running one instance of imapsync at a time.
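To keep to one instance at a time while still covering several labels, the folders can be processed in a plain sequential loop. This is a sketch under assumptions: sync_folders_in_turn and the RUN_IMAPSYNC override are made-up names, and it presumes run-imapsync accepts the folder as its first argument, as described in the previous section.

```shell
# Hypothetical helper: sync the given folders one after another, never in
# parallel. RUN_IMAPSYNC is an assumed override for dry runs (e.g. "echo").
sync_folders_in_turn() {
  for folder in "$@"; do
    echo "syncing: $folder" >&2
    # Each call finishes before the next one starts
    ${RUN_IMAPSYNC:-./run-imapsync} "$folder" || echo "failed: $folder" >&2
  done
}
```

Usage, for example: sync_folders_in_turn "INBOX" "[Gmail]/Sent Mail" "[Gmail]/All Mail"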
Multiple Executions
You will almost certainly have to run the imapsync command multiple times before all of your mail is transferred, unless you have just a few emails to begin with. I had to run it probably 20-50 times to get everything transferred (about 450MB). Imapsync exits every once in a while for whatever reason - maybe the IMAP servers kick it off when they get tired of it.
I did have to run it many more times than 20-50 in the course of figuring out the procedure described in this post, but 20-50 seems about what it took once I started fresh using the command described above.
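Rerunning by hand gets tedious, so a small retry loop is one way to automate it. The sketch below is not what I actually used: retry_until_success is a made-up helper, and it assumes that imapsync exits with a non-zero status when it stops early, which I have not verified. It runs only one attempt at a time and pauses between attempts.

```shell
# Hypothetical helper: rerun a command until it succeeds or the attempt
# limit is reached.
#   $1 = maximum number of attempts
#   $2 = seconds to wait between attempts
#   remaining arguments = the command to run
retry_until_success() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt failed; waiting ${delay}s before retrying" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}
```

Usage, for example: retry_until_success 50 300 ./run-imapsync "[Gmail]/All Mail"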
Conclusion
Overall, the migration from my [user]@gmail.com account to [user]@thamtech.com was a complete success, meeting all of the requirements I mentioned at the beginning of this post. If I find some time, I'll work up more detailed instructions or maybe set up an Amazon EC2 ami image that's ready to go.
March 31st, 2008 at 9:06 am
Hi Tyler,
Finally, a way to properly transfer from gmail to google apps.
One question, did it maintain conversation grouping?
Also (ok, 2 questions), how much did it end up costing with Amazon EC2? Prices seem more than reasonable.
Thanks,
William
March 31st, 2008 at 10:49 am
Hi William,
Good questions! It does maintain the conversation grouping.
As for the cost, I spent a total of $17.66 on EC2 during the month of March working out this procedure and transferring my 450MB Gmail account to Google Apps. Naturally, this involved a LOT of trial and error, costing many more hours and data transfer bytes than if I could have just run the procedure.
I estimate that the cost of transferring my account once would have been under $4.
Tyler
April 11th, 2008 at 3:42 pm
Hey thanks for this guide Tyler!
It was very helpful!
Just something I ran into though, I believe that in certain versions of Perl, you should escape the ‘@’ signs in the regexes; otherwise, you get warnings about the string @gmail being interpolated. At this point, I’m not sure if it will perform the regex properly. At least, this was happening to me with the version of Perl I was running.
April 14th, 2008 at 2:11 am
Hey, were you able to create the EC2 image like you said? I’m fairly new at Linux and I have trouble installing packages. Currently my best bet is to experiment on my own little Linux box, or wait for google to release a newer syncing manager which takes care of all these issues. But I like how you had complete success with your task! :) Now if only I could figure out basic linux myself lol…
April 14th, 2008 at 3:04 pm
Hi neeral, I haven’t had a chance to work on a public EC2 image yet. I’ll try to get one up sometime this week.
April 16th, 2008 at 12:46 am
sweetness, I eagerly await! :)
April 18th, 2008 at 4:54 am
Does anybody have a simpler way to do this? Honestly, for those of us without much computer knowledge, this gets complicated….
Thanks!
April 19th, 2008 at 3:20 am
Great script, I’ve tested it and all is good. Just one thing: the dates of the messages appear as the date of migration, not the original date of the message. But when I click on a message and take a look at the header, it shows the real date of the message. Any clue, dude?
April 19th, 2008 at 9:54 am
MOe, that is strange. The dates worked out fine for me. In fact, it was one of my requirements that the dates showed the date of the message, not the date of migration.
Ashley (posted on April 11) mentioned a regex issue with certain versions of Perl. I suspect the version of Perl, imapsync, or one of the many required libraries or modules may be the cause of your date issue.
Sorry to neeral and anyone else waiting for an EC2 image, but I have not had a chance to work on that yet. I will try to get to it soon.
April 22nd, 2008 at 12:41 pm
Thanks for this - very timely for me. I have just married and want to change my last name, and my main technical barrier has been my firstname.lastname@gmail.com address. It would be terribly convenient if this works.
April 23rd, 2008 at 1:59 am
Not Gmail to Gmail per se, but I found this site while looking for Courier to Gmail migration.
Imapsync does have a FAQ that shows some examples of the ‘less documented’ features:
http://www.linux-france.org/prj/imapsync/FAQ
But where I got stuck was Gmail’s ‘special’ folders, e.g. my traditional Inbox.Sent -> [Gmail]/Sent Mail.
In this case the author’s FAQ didn’t work for me, but I eventually got it to work as such:
--folder Inbox.Sent \
--regextrans2 's/Inbox.Sent/\[Gmail\]\/Sent Mail/'
April 25th, 2008 at 8:12 am
I am pretty poor so I skipped the Amazon flexible computing step and ran this on one of my home linux boxes, running Ubuntu Hardy Heron.
I only had to
#apt-get install imapsync make unzip lynx
#cpan
cpan> install Date::Manip
Then drop back to standard prompt and:
./run-imapsync
note: I did have to escape the @’s in the regex bit by putting a backslash in front, e.g. user\@gmail.com
Oddly enough, on the first run it says I don’t have anywhere near the number of messages in my inbox and archive as I do. Ah well, it is working and that’s the main thing.
Thanks Tyler.
April 25th, 2008 at 10:05 am
I noticed that I still had my "--maxage 1" parameter in my run-imapsync script. I didn’t intend for it to be there in this post, so I edited it out.
Morgan, check the maxage parameter. That could cause the script to show too few messages in your inbox.
Tyler
April 25th, 2008 at 9:23 pm
Tyler I noticed that max age bit after I grep’ed the man page. It all seems fixed now. I tried to post here but got an error with wordpress.
I blogged about it too: http://morganstorey.com/2008/04/movement-at-station-so-i-have-decided.html
June 26th, 2008 at 3:19 pm
Wow, this saved me SO much time! It seemed to error out a lot, but since it will skip already synced messages I just reran it many times. Actually, to make it easier to know where I left off I did each label/folder individually using the --folder option.
Thanks for putting this guide together.
David
June 27th, 2008 at 2:28 am
Hey,
thanks for the instructions! I did this on a simple Linux (Ubuntu) setup as well. Just wanted to point out that for some reason it doesn’t work with the latest version of imapsync (2.52). I’m now running with 2.49 and it looks good.
wo.
July 3rd, 2008 at 11:07 pm
Did you have any trouble with imapsync truncating emails?
I am trying to get imapsync to run with a cron job every night and backup all of my Gmail to my local IMAP server, but it is cutting off the last line of every single email.
July 7th, 2008 at 2:56 am
imapsync 1.217 saves the day! I previously used 1.252 and 1.249 with the truncating problem.
July 19th, 2008 at 10:56 pm
I hate being Unix illiterate… I could figure this out as i can do a few things, but this would take me a lot of time and wouldn’t accomplish my goal of migrating documents as well… though I could probably just download all the documents and re-upload them.
Anyone want to set this up for me on a box? :)
August 7th, 2008 at 6:07 am
Hi there. I would like to transfer mail from my old large gmail account to a new gmail account or similar account (google apps?) or to my desktop. Can you kindly help me or advise on how to do it? I am not a developer. Can I run this script myself? Thank you.
August 8th, 2008 at 5:46 pm
Thanks a lot for the script! I’m using it now to transfer ~600MB of mail from my gmail account to my new google apps gmail account. One thing I noticed after it crashed out the first time is that it doesn’t appear to be successfully identifying duplicate messages. It just starts over at the beginning, copying all messages over. I’ve gone to using --folder and --maxage to get through all the email, but this is peculiar. I’m running on Ubuntu Gutsy and have tried both imapsync 1.219 and 1.255. Same result.
August 29th, 2008 at 10:31 am
I ran into the problem that in Germany (and I think also in the UK), google had to change their service from gmail to googlemail due to some legal crap.
You can add the following command to rename the [Google Mail] folders to the required [Gmail] folder:
--regextrans2 's/\[Google Mail\]/\[Gmail\]/'
August 30th, 2008 at 11:10 am
There is a problem with gmail when you have same files in the trash folder.
Though imapsync says it’s copied, it is in fact not (or it actually is but google doesn’t create a label correctly).
Basically, just make sure that your trash is empty before syncing.
Here’s the conversation I had recently with Gilles:
———
Hi Gilles,
On Sat, Aug 30, 2008 at 10:08 PM, Gilles LAMIRAL
> Ok.
> I made more tests. I found this behavior:
>
> 1) First gm_1 to gm_2 is OK with inbox
>
> 2) I delete gm_2/inbox
> messages go to gm_2/Trash
>
> 3) Second gm_1 to gm_2 is KO with inbox
> gm_1/inbox messages are no longer stored
> in gm_2/inbox even if imapsync did it successfully.
>
> 4) I empty gm_2/Trash
>
> 5) Third gm_1 to gm_2 is OK with inbox
> gm_1/inbox messages are in gm_2/inbox
>