I'm curious as to why you chose to break out specific headers in the schema.
For example, you have recipients, subject, and sender as JSON fields, when you could have just a headers field with all of them, and even add the rest of the headers in the message.
If it's performance related, you can still have headers as a single json blob and then use generated columns for the specific fields.
For example
CREATE TABLE IF NOT EXISTS "messages" (
"id" INTEGER NOT NULL PRIMARY KEY, -- internal id
"message_id" TEXT NOT NULL, -- Gmail message id
"thread_id" TEXT NOT NULL, -- Gmail thread id
"headers" JSON NOT NULL, -- JSON object of { "header": value },
"subject" TEXT GENERATED ALWAYS AS (json_extract("headers", '$.Subject')) VIRTUAL NOT NULL,
...
);
CREATE INDEX subjectidx on messages(subject);
I've found this model really powerful, as it allows users to just alter table to add indexed generated columns as they need for their specific queries. For example, if I wanted to query dkim status, it's as simple as
ALTER TABLE messages ADD dkim TEXT GENERATED ALWAYS AS (json_extract("headers", '$."Dkim-Signature"')) VIRTUAL NOT NULL;
CREATE INDEX dkimidx on messages(dkim);
SELECT dkim, COUNT(0) FROM messages GROUP BY dkim;
or whatever you want.
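A minimal runnable sketch of the pattern above, using Python's built-in sqlite3 module. The sample rows and the `counts` variable are made up for illustration, the table is trimmed to a few columns, and `headers` is declared as TEXT here. The NOT NULL constraints on the generated columns are left out so that rows without a given header still insert. It assumes an SQLite build with generated-column support (3.31 or later) and the JSON functions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    id INTEGER NOT NULL PRIMARY KEY,
    message_id TEXT NOT NULL,
    headers TEXT NOT NULL,  -- JSON object of { "header": value }
    subject TEXT GENERATED ALWAYS AS (json_extract(headers, '$.Subject')) VIRTUAL
);
CREATE INDEX subjectidx ON messages(subject);
""")

# Hypothetical sample data: two signed messages and one unsigned one.
sample = [
    ("m1", {"Subject": "Hello", "Dkim-Signature": "v=1; d=example.com"}),
    ("m2", {"Subject": "Invoice", "Dkim-Signature": "v=1; d=example.com"}),
    ("m3", {"Subject": "No signature here"}),
]
conn.executemany(
    "INSERT INTO messages (message_id, headers) VALUES (?, ?)",
    [(mid, json.dumps(hdrs)) for mid, hdrs in sample],
)

# Later on, a new question comes up: add a virtual column plus an index for it.
conn.execute("""
ALTER TABLE messages ADD COLUMN dkim TEXT
    GENERATED ALWAYS AS (json_extract(headers, '$."Dkim-Signature"')) VIRTUAL
""")
conn.execute("CREATE INDEX dkimidx ON messages(dkim)")

# Map each distinct dkim value (None when the header is absent) to its count.
counts = dict(conn.execute("SELECT dkim, COUNT(*) FROM messages GROUP BY dkim"))
print(counts)
```

Because the column is VIRTUAL, nothing extra is stored in the table itself; only the index takes space.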
Hakkin 2 days ago [-]
Note that you don't actually need the generated column either, SQLite supports indexes on expressions, so you can do, for example,
CREATE INDEX subjectidx ON messages(json_extract(headers, '$.Subject'))
and it will use this index anywhere you reference that expression.
I find it useful to create indexes like this, then create VIEWs using these expressions instead of ALTER'ing the main table with generated columns.
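A small sketch of the expression-index-plus-view variant, again with Python's sqlite3. The view name `messages_by_subject` and the sample rows are assumptions made for the example. The query plan is inspected for the direct expression query; whether the planner also uses the index through the view can be checked the same way.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    id INTEGER NOT NULL PRIMARY KEY,
    headers TEXT NOT NULL
);
-- Index the expression itself; no generated column needed.
CREATE INDEX subjectidx ON messages(json_extract(headers, '$.Subject'));
-- Hypothetical view that gives the expression a friendly name.
CREATE VIEW messages_by_subject AS
    SELECT id, json_extract(headers, '$.Subject') AS subject
    FROM messages;
""")
conn.executemany(
    "INSERT INTO messages (headers) VALUES (?)",
    [(json.dumps({"Subject": s}),) for s in ("Hello", "Invoice", "Hello")],
)

# Ask SQLite how it would run a query that repeats the indexed expression.
plan = [
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id FROM messages WHERE json_extract(headers, '$.Subject') = ?",
        ("Hello",),
    )
]
uses_index = any("subjectidx" in detail for detail in plan)

# Query through the view instead of spelling out the expression.
hello_ids = [
    r[0]
    for r in conn.execute(
        "SELECT id FROM messages_by_subject WHERE subject = ? ORDER BY id",
        ("Hello",),
    )
]
print(plan, hello_ids)
```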
What a great timely tip. Was just looking for good direction on how to do this. Thanks!
tqi 1 days ago [-]
Adding indexes to support a one off query seems like bad practice?
In general I prefer to break out columns that I expect to have/use consistently, especially for something as stable as email headers. Maybe schema changes are a bit easier with a headers column, but imo it's just trading pain on write for pain on read (while leaving the door open to stuff failing silently).
timeinput 1 days ago [-]
I reach for a similar pattern a lot with postgres as I'm building up a system. I start by thinking about the fields I know I want and create the tables with them, then store all the metadata I have lying around in a json column. Then in 2 months, when I realize what fields I actually need, I populate them from the json, and either make my API keep them up to date, or make a view, or whatever.
I've found it really helpful for avoiding the growing pains that come with "just shove it all in mongo" or "just put it on the file system", without much cost.
dotancohen 2 days ago [-]
I see that you defined the `dkim` column as NOT NULL. So what happens when an email message does not contain the Dkim-Signature header?
hun3 2 days ago [-]
Probably something like
Error: stepping, NOT NULL constraint failed: messages.dkim (19)
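A quick way to reproduce this, as a sketch with Python's sqlite3: the table below declares the generated `dkim` column as NOT NULL at creation time, and the inserted row (made up for the example) has no Dkim-Signature header, so json_extract yields NULL and the insert is rejected.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE messages (
    id INTEGER NOT NULL PRIMARY KEY,
    headers TEXT NOT NULL,
    dkim TEXT GENERATED ALWAYS AS
        (json_extract(headers, '$."Dkim-Signature"')) VIRTUAL NOT NULL
)
""")

error = None
try:
    # This message has no Dkim-Signature header at all.
    conn.execute(
        "INSERT INTO messages (headers) VALUES (?)",
        (json.dumps({"Subject": "unsigned mail"}),),
    )
except sqlite3.IntegrityError as exc:
    error = str(exc)

# Count the rows that actually made it into the table.
row_count = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(error, row_count)
```

Dropping NOT NULL from the generated column lets such rows through, with `dkim` simply reading as NULL.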
Hey this is really neat! It's like those disk usage visualizers, except that it seems to focus on the total volume of the mail rather than the disk usage.
Is there a size option too? To see which senders are using most of my storage.
(Also your website's SSL certificate has expired.)
terhechte 2 days ago [-]
No, currently not. It would be easy to add though. I haven't updated the tool in a while (after using it to clean up my Gmail inbox). Thanks for pointing out the certificate!
Funnily enough, the gmvault.org domain _that_ page points to is simply a parked GoDaddy placeholder. It's also not been updated in 10+ years except for two non-source files.
nijave 1 days ago [-]
This looks interesting. I've DIY'd something similar with qdirstat before but you need to arrange your emails a certain way like dated folders and can't re-slice with different criteria.
On the other hand, qdirstat "cache" files are really easy to generate so can be used for visualizing a bunch of file-like things
the_mitsuhiko 2 days ago [-]
I really lament that you cannot sign in even with an application specific password any more and you need to get an oauth client and go through an oauth flow. It’s my email, but Google takes away an open standard even for myself to access it.
sdoering 2 days ago [-]
Given the amount of spam I receive on my free Gmail addresses (compared to my paid-for freelance one), and the amount of spam I receive from Gmail servers on my non-Gmail email accounts, I am more and more inclined towards degoogling myself.
Especially as I receive more and more information that my freelance e-mail is put into spam by recipient systems.
Not sure how to get rid of my Google ecosystem routines, though. Feels daunting.
redeeman 1 days ago [-]
step 1: extract data
step 2: just dont use google shit anymore. Deal with it.
you dont get it done by moping about it, but by doing
someguydave 1 days ago [-]
It would also help if you did step 0: buy your own email domain
cowboylowrez 1 days ago [-]
Personally, I love sending emails nobody will receive, it removes inhibitions and lets me speak my mind without regrets!
codazoda 1 days ago [-]
This isn’t as hard as you might think. I pay for https://mailwip.com because the founder helped me figure out mine. It was ultimately relatively straightforward. I stay because I appreciate his work, my email is flawless, and I like the logs they provide.
acheong08 1 days ago [-]
I've been self hosting email for a few years at this point and haven't had any delivery issues. Just make sure you set up all your DNS correctly and avoid polluted IP ranges like DO or AWS
kasey_junk 1 days ago [-]
Sorry, why do you consider app specific passwords an open standard but oauth not?
simonw 1 days ago [-]
POP3/IMAP work with any client that supports those protocols.
OAuth really doesn't. Every OAuth integration I've ever built always feels like it needs a tiny bit of custom development.
Also the OAuth flow is usually absolutely horrible for when you're trying to get a token for accessing your own data. I've had to spin up a temporary web app to handle a bunch of redirects just to get my own token!
sir 1 days ago [-]
I built a proxy a while ago to make this easier - it lets you stick with IMAP/POP/SMTP as-is. No need for your client to even know that OAuth exists. See here: https://github.com/simonrob/email-oauth2-proxy
the_mitsuhiko 1 days ago [-]
> No need for your client to even know that OAuth exists
Yes, you can do that, however the problem is getting a client_id/client_secret in the first place. You need to register yourself for one, you need to (nowadays) whitelist every single account or go through a google verification process. At one point you could apply for a client_id that allowed anyone to use it, but that ship has sailed.
kasey_junk 1 days ago [-]
So that’s an argument about a protocol preference, not an openness one. Which frankly makes a lot of sense and wouldn’t have confused me.
the_mitsuhiko 1 days ago [-]
> So that’s an argument about a protocol preference, not an openness one.
Just to make sure the differences are clear: with username and password and IMAP I can use an RFC-standardized protocol to sign into an inbox, and I do not need Google's permission. The oauth flow they have is not standardized (XOAUTH2 is not a standard, as far as I know at least), requires provider-specific logic (Outlook is different to Google), and most importantly requires me to get Google's permission to sign in. I need to get a client_id with the necessary scope, and that is only granted after a review by Google. [1]
[1]: The asterisk is that a development-only app can authenticate up to 100 users, and those users need to be explicitly whitelisted in the dev panel.
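For readers who have not seen it, here is a sketch of what the XOAUTH2 step looks like from a client, using Python's standard imaplib. The initial client response follows the format Google documents for the mechanism (user and bearer token separated by Ctrl-A bytes). The address and token are placeholders, `login_with_token` is a hypothetical helper that is defined but not called here, and obtaining the access token (the client_id part discussed above) is exactly what this sketch leaves out.

```python
import base64
import imaplib


def xoauth2_string(user: str, access_token: str) -> str:
    """Build the raw (not yet base64-encoded) XOAUTH2 client response."""
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01"


def login_with_token(host: str, user: str, access_token: str) -> imaplib.IMAP4_SSL:
    """Hypothetical helper: open an IMAP connection and authenticate with a token.

    imaplib base64-encodes whatever bytes the callback returns.
    """
    conn = imaplib.IMAP4_SSL(host)
    conn.authenticate(
        "XOAUTH2",
        lambda _challenge: xoauth2_string(user, access_token).encode(),
    )
    return conn


# Placeholder values; a real token comes out of the OAuth flow.
raw = xoauth2_string("me@example.com", "ya29.placeholder-token")
encoded = base64.b64encode(raw.encode()).decode()
print(encoded)
```

Compare that with plain `conn.login(user, password)`, which is all an app password needs.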
tptacek 1 days ago [-]
That's an appeal to IETF canon, which might be a valid concern (I wouldn't share it, as an opponent of the IETF) but remains orthogonal to "openness". A protocol is open if it's published and especially if it's widely used, which this configuration is.
the_mitsuhiko 14 hours ago [-]
I think you're hyper focused on a point I wasn't making.
isaachinman 1 days ago [-]
Sorry, I don't quite get the point you're trying to make...
With an app password you have full IMAP access.
the_mitsuhiko 14 hours ago [-]
> With an app password you have full IMAP access.
App passwords no longer exist on Google.
isaachinman 14 hours ago [-]
That is 100% untrue. I've built functionality on app passwords and use them on a daily basis.
the_mitsuhiko 8 hours ago [-]
On my Google Workspace org, app passwords don't even show up in the user account settings any more. The documentation says they are recommended against, and the IMAP docs say only oauth2 is supported. However I just found posts that suggest that you can still access them if you navigate directly to the app passwords page. I will try.
isaachinman 7 hours ago [-]
We're using app passwords for Marco (https://marcoapp.io), until we get through the unnecessarily-rigorous process of gaining privileged OAuth access for email scopes.
Every year the imap option ("app passwords") gets buried deeper and deeper in the settings.
isaachinman 1 days ago [-]
Indeed. Quite a hassle to enable now. Multiple requirements including 2FA.
bradgessler 1 days ago [-]
The steps Google makes people jump through just for API keys are absolutely insane.
Does anybody have insight as to why it’s so bad?
victorbjorklund 1 days ago [-]
Probably because if you get API access to someone's email account it is game over. And people are stupid so some of them are going to click yes to some scammy app. And then they will blame Google for not protecting them.
IMTDb 1 days ago [-]
Because otherwise tons of people anonymously create api keys with extremely wide scopes for small / low quality apps.
When those inevitably get used for nefarious purposes, Google's image suffers as a result.
rantingdemon 8 hours ago [-]
Interesting tool. I'm trying it out now. I had to jump through some hoops in Google's admin panel that probably had me creating some OAuth org for my personal account...
It is now syncing my messages, but very slowly. Some Async magic could probably be cool :)
TekMol 2 days ago [-]
Shouldn't this be "imap to sqlite" or something? Why tie it to one specific email provider?
isaachinman 2 days ago [-]
Because _it is_ specific to Gmail. It's using OAuth and presumably API access.
IMAP is much harder, and much slower, and is bound by Google's bandwidth limits.
pastage 2 days ago [-]
Doing a mbox export with Google Takeout from gmail is pretty fast.
remram 1 days ago [-]
What? You have to schedule it, they literally wait 3 days before they start it, and then it can take most of a day to get it ready for download. It is not fast.
kilroy123 1 days ago [-]
I've never had to wait that long. I usually can download within 20 minutes and it's 15 GB of data.
crazygringo 1 days ago [-]
I've never had that experience. You don't need to schedule anything, and it takes maybe part of an hour to be ready to download?
Maybe there have been times when it was broken or under high demand though?
phh 1 days ago [-]
FWIW, for several years I've tried backing up my gmail account with imap (including some stuff made specifically for gmail): it never succeeded. The best syncers ran for one month, and after that month hit some mails they simply couldn't retrieve. I guess those were in too-cold storage and timed out? I don't know.
So I can understand why using Google's proprietary API might work better (or not, I don't know)
Anyway, as a sibling says, nowadays Google Takeout includes mbox and works properly (and is pretty fast, like half a day), but doesn't allow continuous updates.
And I migrated to another mail provider (infomaniak), and I've thanked myself for using my own mail domain name years earlier.
pertique 1 days ago [-]
I had the same problem when I switched off Google. I didn't have a ton of data, and I just wanted content for past search purposes, so I didn't dig into how the data would be transformed but I can at least offer my scuffed solution.
I installed a third-party client (Thunderbird, but I imagine any would work) on a local box, signed in with both emails, and just copied the mail over from one to the other. Low-tech, but it worked quite well. I may have forced some local cache/download for the original email, but I can't recall. I'll check later if it preserves headers and the like. I assume it would, but it wasn't that important to me.
I actually thought about writing at some point about the process of getting off gmail and all the funny things I ran across.
Yep. I did the same to group by domain and sender.
yread 2 days ago [-]
Would be nice to enable fulltext search as well
padjo 2 days ago [-]
Yes! I find gmail’s full text search surprisingly bad given it’s run by a search company.
porker 2 days ago [-]
But not as bad as Outlook 365's search...
pastage 1 days ago [-]
Outlook must be the worst email client there is. Something about least common denominator.
jbverschoor 2 days ago [-]
Boot as bad as the Mail.app from iOS and macOS
NelsonMinar 1 days ago [-]
It got a lot worse recently when they added bad AI to it. Now it does dumb synonyms. Like I search for "doctorate" and it starts highlighting every instance of D alone, like the word "he'd". (Presumably trying to pick up Ph.D.?) For awhile searches for "A" would have it highlighting "the", too.
isaachinman 2 days ago [-]
Agreed! One of the reasons we started working on Marco.
It's the least bad search--better than Yahoo, and Thunderbird desktop in my direct experience. However, I don't download the full message into Thunderbird out of fear of blasting through gmail bandwidth limits.
padjo 13 hours ago [-]
I’ve searched for a fairly rare word pair in the subject line of an email I want and gotten 20 irrelevant results. The only way to find it was by filtering senders and dates.
The schema from AOX always looked really good to me, but I never have gotten to really giving it a try. I wanted to use it, primarily, to get analytics about my mail and for search (not a daily-driver IMAP server).
natmaka 19 hours ago [-]
I'm reminded of Manitou-Mail, a daily-driver powerful PostgreSQL-based dedicated mail client, quite robust and powerful.
Thanks for the note about that. That's an interesting-looking take on the problem.
ThinkBeat 1 days ago [-]
What is the cost for bandwidth here?
As someone with a 40GB+ Gmail account, will I get billed for the transfer using this tool?
It is easy to fix though, since I can get Google Takeout (is that the name?), which I think is free, and then parse the files once downloaded.
Still, using this tool would be faster from a get-it-going perspective.
shinryuu 21 hours ago [-]
Would have been nice if you supported google takeout with mbox instead.
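As a rough idea of what mbox support could look like (a sketch, not how the tool works): Python's standard mailbox module can read an mbox file, and each message's headers can be stored as a JSON blob in SQLite. The `import_mbox` function, the table layout, and the two-message sample file are all invented for the example; a real Takeout export would be passed in by path instead.

```python
import json
import mailbox
import os
import sqlite3
import tempfile

# Tiny stand-in for a Takeout export: two messages in mbox format.
SAMPLE_MBOX = (
    "From alice@example.com Thu Jan  1 00:00:00 2024\n"
    "From: alice@example.com\n"
    "To: me@example.com\n"
    "Subject: Hello\n"
    "\n"
    "First body\n"
    "\n"
    "From bob@example.com Thu Jan  1 00:00:01 2024\n"
    "From: bob@example.com\n"
    "To: me@example.com\n"
    "Subject: Invoice\n"
    "\n"
    "Second body\n"
)


def import_mbox(mbox_path, conn):
    """Copy every message in an mbox file into a `messages` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, headers TEXT NOT NULL, body TEXT)"
    )
    box = mailbox.mbox(mbox_path)
    count = 0
    try:
        for msg in box:
            # Note: a dict keeps only the last value of a repeated header.
            headers = {name: str(value) for name, value in msg.items()}
            payload = msg.get_payload(decode=True)  # None for multipart mail
            body = payload.decode("utf-8", errors="replace") if payload else None
            conn.execute(
                "INSERT INTO messages (headers, body) VALUES (?, ?)",
                (json.dumps(headers), body),
            )
            count += 1
    finally:
        box.close()
    return count


conn = sqlite3.connect(":memory:")
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.mbox")
    with open(path, "w", newline="\n") as fh:
        fh.write(SAMPLE_MBOX)
    imported = import_mbox(path, conn)

subjects = [
    r[0]
    for r in conn.execute(
        "SELECT json_extract(headers, '$.Subject') FROM messages ORDER BY id"
    )
]
print(imported, subjects)
```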
flashblaze 10 hours ago [-]
Is there anything similar written in TypeScript?
1vuio0pswjnm7 19 hours ago [-]
Wasn't there a period where one could get an XML feed of their Gmail, many years ago?
vladgur 24 hours ago [-]
Awesome!
Feature request: parse email content to extract unsubscribe links and allow me to unsubscribe from most frequent senders easily
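One possible starting point, assuming the senders set the List-Unsubscribe header (RFC 2369) rather than only putting a link in the body: the header holds one or more targets in angle brackets, which can be pulled out with a regular expression. The function name and the sample header value are made up for the sketch.

```python
import re


def unsubscribe_targets(header_value: str) -> list:
    """Return the mailto:/https: targets listed in a List-Unsubscribe header."""
    return [target.strip() for target in re.findall(r"<([^>]*)>", header_value)]


# Hypothetical header value as it might appear in a newsletter.
links = unsubscribe_targets(
    "<mailto:unsub@example.com?subject=unsubscribe>, "
    "<https://example.com/unsubscribe?id=42>"
)
print(links)
```

Combined with a per-sender count, that would give a "top senders with an unsubscribe link" list.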
alimbada 1 days ago [-]
I did something similar using Got Your Back and some C# hacked together in LinqPad to help me analyse my emails.
flas9sd 1 days ago [-]
having sqlite exporters for platforms is great help for archiving, but also general questions: I used https://github.com/ltdangle/mail2db to see how much mail volume I still receive monthly on a mail account that I want to move away from. A top10 of senders directed my un- and resubscribe actions.
gitroom 1 days ago [-]
man, the whole gmail backup mess reminds me why i avoid locking myself into someone else's sandbox. figuring out what actually keeps me sticking with a platform even when i know all the downsides - is it just laziness or something deeper?
noer 1 days ago [-]
This is just a single table DB though? At that point, why not just export to a csv or dataframe or whatever and leverage analysis packages to analyze whatever you wanted to.
I admittedly might just not have or understand the use case nor have I thought about how large a Gmail account actually is so feel free to ignore if I'm missing something!
hiAndrewQuinn 1 days ago [-]
A couple of reasons which pop to mind:
- Searching a plain text data file is O(n). Searching a SQLite database that has been properly indexed, which is very easy to do nowadays with FTS5, is O(log n) worst case scenario and O(1) in the best case. This doesn't explain why SQLite over a dataframe or anything, but it definitely justifies it over plain text for large email collections.
- SQLite is really easy to write custom views and programs around. Virtually every major programming language can work with it without issue. See also: simonw's wonderful https://datasette.io/ .
- SQLite is an accepted archival format by the Library of Congress, if you ever want to go down the rabbit hole of digital preservation.
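To make the FTS5 point concrete, a minimal sketch with Python's sqlite3. It assumes the SQLite library Python is linked against was compiled with FTS5, which is common but not guaranteed. The table name and the three sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Full-text index over two columns; assumes FTS5 is compiled in.
conn.execute("CREATE VIRTUAL TABLE messages_fts USING fts5(subject, body)")
conn.executemany(
    "INSERT INTO messages_fts (rowid, subject, body) VALUES (?, ?, ?)",
    [
        (1, "Your doctorate application", "We received your documents."),
        (2, "Lunch?", "He'd like to meet on Friday."),
        (3, "Re: thesis", "The doctorate committee met yesterday."),
    ],
)

# MATCH does a token lookup in the full-text index instead of scanning text.
hits = [
    r[0]
    for r in conn.execute(
        "SELECT rowid FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rank",
        ("doctorate",),
    )
]
print(sorted(hits))
```

The default tokenizer matches whole tokens, so the row containing only "He'd" is not returned for "doctorate".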
jokoon 2 days ago [-]
I would have preferred a script that parses the mail backup Google sends you.
I think it's a big eml file.
ukuina 2 days ago [-]
Google Takeout regularly fails to complete for me. Syncing via the API seems like a reasonable alternative.
oulipo 1 days ago [-]
What's the best open-source GMail backup software that exists? Has someone set up something like that? (also archiving attachments, etc)
For a long time and it's worked great. But it seems like GYB is actively maintained, so maybe I should switch.
karteum 22 hours ago [-]
FWIW, I used gmvault a long time ago (N.B. I typically deactivated individual .eml.gz compression in favor of a more global compression with a squashfs archive of the gmvault backup). Since I found it not very practical to search through that archive, I developed https://github.com/karteum/gmvaultdb in order to convert it to a local sqlite DB. (I made it for personal use and chose to keep the schema simple to have simple queries for my case. If the DB were to grow bigger I'd probably use a different schema, e.g. placing the contacts/from/to/cc in separate tables. I also chose to extract attachments directly onto the filesystem rather than put them in the DB, which enables direct access and deduplication.)
(N.B. it can also process the mbox produced by https://takeout.google.com/, with the caveat that in some cases Takeout permanently loses some information because of a bug with older encodings, so I'd always prefer a backup using gmvault or imap.)
oulipo 1 days ago [-]
Thanks!
sbarre 1 days ago [-]
This isn't exactly what you're asking for, but Google offers a service called Takeout that lets you request and download backups of all your data from their services, including Gmail.
I have a reminder to trigger this every few months and update my local backup. If I recall it comes as a gzipped mbox file.
oulipo 15 hours ago [-]
Yes, but I'd rather do it "continuously"; the issue I want to guard against is Google locking me out of my account for some random reason.
nijave 1 days ago [-]
You can also use an IMAP client and set it to offline/download mode so it downloads everything and saves it locally. I think "offline mode" is what it's called in Evolution--not sure what Thunderbird or other clients call it.
pdyc 2 days ago [-]
this is great if only there was a tool for whatsapp to sqlite it would make my data so much more useful
On both iOS and Android it's all stored in sqlite already. Table schemas circulate on forensics blogs and Q&A sites, along with how to obtain the unencrypted db.
hamburglar 1 days ago [-]
This looks great and simple. I’ll likely try it out. Any chance you’re working on including attachment metadata (and/or broken out access to the attachments themselves) in the future?
frshOffTheBoat 23 hours ago [-]
Oh wow, this is possible - thanks!
pdimitar 2 days ago [-]
Would love a comparison to gbackup-rs[0].
To me having to install a tool through Python is a show-stopper.
Does that tool still work? It's not had any updates in 3 years and it looks like it uses IMAP, which may not be available for new Gmail accounts now that they're moving away from even per-app passwords.
pdimitar 1 days ago [-]
My account is very old and the tool works on it. Figured it will probably stop working at some point, thanks for the refresh.
I have that tool activating once every 24h still, to this day.
curtisszmania 1 days ago [-]
[dead]
einpoklum 2 days ago [-]
Let us stop using GMail:
* Google collects vast amounts of personal data, specifically through receiving all of your email and analyzing it.
* It builds elaborate user profiles and uses them to target you with ads designed to better influence you.
* Its hold on information (from different sources) has made it excessively powerful economically, and thus also politically.
* Google/Alphabet has long started to affect legislation, including through direct registered lobbying: ~15 Million USD in 2024 (opensecrets.org).
* It has been known to pass, and likely still passes, the information it collects - including copies of your email correspondence - on to the US government (Edward Snowden leaks).
and finally:
* There are multiple email providers, many of them quite good - both for pay and gratis. Naturally most of the gratis ones have their own interests, but nothing like Google.
This should be seen as encouragement to switch to something else rather than as defeatist. Many of my communications do not touch Google services; professionally it has been judged too risky, and personally I keep a Google account but also others.
Edit: You can create groups of people that are not affected by Google/Apple/Facebook, this should be seen as a goal.
devrandoom 1 days ago [-]
This could be an interesting take, GDPR wise. Google handling personal emails of people that have no contract or business relationship with Google.
remram 1 days ago [-]
What's a good replacement? Needs to work on web and mobile (or desktop and mobile), have search, have labels, have automated filters.
KawaiiCyborg 18 hours ago [-]
I've been using Fastmail.com since December, 2014 now and I'm still very happy with them. I use them with a few of my own domains and I've never had any issue sending or receiving any email. Recently, they even got a 1Password integration that allows you to directly create a masked email for a service from my 1Password UI.
Search works perfectly, labels feel good to use and the filters are very flexible. If the UI doesn't allow you what to do, you could even directly write your own Sieve scripts.
bob1029 1 days ago [-]
I've been using AWS WorkMail since it was released. I prefer this arrangement because I can administer the related domain & DNS concerns in the same place.
$4/m seems nominal for a 50GB mailbox with no weird adtech shit built in.
remram 1 days ago [-]
Leaving Google for Amazon makes no sense to me, Amazon fits GP's list just as well.
bob1029 1 days ago [-]
This elevation of products into their parent organizations makes no sense to me.
Gmail != Google
WorkMail != Amazon
Gmail is targeted at consumers and is engineered to suck up your data to pay for itself. WorkMail is targeted at businesses and is engineered to not piss off IT administrators and middle management.
remram 1 days ago [-]
That's fair, still if I put in the effort to migrate, I would rather not do it to another company whose business is selling eyeballs. Amazon's business (maybe not WorkMail? who knows) is to sell you a maximum of stuff on their marketplace, building a detailed profile of you to recommend you more stuff.
I could also pay for a Google Workspace and stay with Gmail.
einpoklum 5 hours ago [-]
First, I believe you're making the wrong assumption that a good replacement is web-based. The web is a poor interface for accessing email. Use a mail client! That will work on PCs, laptops and mobile apps - and you don't depend on your email provider for your usage experience; they all offer IMAP (and many offer POP3). That means you'll have search, labels, elaborate automated filters - including custom actions on your computer, which no web-based interface will offer you (I think). Most email providers do offer a web interface, and some of those are pretty nice, but I'd always consider them a fallback.
Anyway, among the popular providers, I am partial to ProtonMail:
it's Swiss, run by a non-profit, and originally crowd-funded. Over 100 Million users, IIANM. I am not a cyber-security expert, so I can't claim to have audited them for security or privacy bona fides (but maybe somebody has?).
I've also heard suggestions to try Tuta (tuta-nota), but have never tried it myself.
There are also many smaller providers. Specifically, many Internet access providers also provide email services. Not that they are super trustworthy, but there's a good chance you're not just handing everything over to one of the behemoths like Google or Microsoft.
If you do end up on one of the bigger providers - it's probably best to be on one that's not linked to the government where you live. So if you're in a NATO country you could go for yandex or mail.ru and if in Russia then maybe GMX?
Unfortunately - wherever you take your email - when you write a GMail address, Alphabet has a hold of your correspondence again. So, we need to convince our friends to ditch Google as well.
remram 4 hours ago [-]
I made no such assumption, I wrote "web (or desktop)". Web is what I have now.
A problem with desktop clients is that they are usually generic IMAP clients, meaning they have limited support for server-side search, non-standard features like labels, and creating server-side filters.
phantompeace 1 days ago [-]
What are the best free replacements for gmail that are realistic to switch to? I.e., well established and not poised to close down any time soon.
justin_oaks 1 days ago [-]
You may have to temper your expectations. Free usually means "sells/uses your data to offset costs". If you're OK with that, there's no need to switch off of GMail. If you're not OK with that, you'll have to pay.
Also, hosting email under your own domain gives you the freedom to move from one email provider to another even if they do shut down.
I put my money where my mouth is. I wanted to degoogle and so pay $50/year for Fastmail. One feature I like is automatically snoozing certain emails. Most of my non-personal email is automatically snoozed until 6pm every day. This way I don't get multiple notifications throughout the day for emails that aren't time sensitive.
mediumsmart 2 days ago [-]
I was with you from day one and never started using gmail.
https://github.com/terhechte/postsack
Here's the support article we link our users to:
https://support.google.com/accounts/answer/185833
Workspace is a bit different, however. You need an admin to enable app passwords.
Gmail to SQLite describes 6 steps to get credentials working, but that was not enough for me. After the 6 steps:
- Google said that my app was not published, so I published it
- Google said that app cannot be internal, because I am not a workspace user
- for external apps
- then it said I cannot use the app until it is verified
- in verification they wanted to know domain, address, other details
- they wanted to have my justification for scopes
- they wanted to have video explaining how the app is going to be used
- they will take some time to verify the data I provided them
It all looks like a maze of settings, where requiring any user to jump through the hoops required by Google is simply too much.
Links:
[0] https://github.com/rumca-js/Django-link-archive
Don't jump through their hoops.
https://imapsync.lamiral.info/
https://marcoapp.io
The schema from AOX always looked really good to me, but I never got around to really giving it a try. I wanted to use it, primarily, to get analytics about my mail and for search (not as a daily-driver IMAP server).
https://www.manitou-mail.org/
It is easy to fix though, since I can get Google Take Out (is that the name?), which I think is free, and then parse the files once downloaded.
Still, using this tool would be faster from a get-it-going perspective.
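For anyone going the Takeout route, here is a rough sketch of the "parse the files once downloaded" step using Python's standard `mailbox` module. It assumes the export has already been unpacked to a plain mbox file. The `load_mbox` helper, the table name, and the columns are made up for illustration and are not the schema used by Gmail to SQLite.

```python
import mailbox
import sqlite3


def load_mbox(mbox_path, db_path=":memory:"):
    """Copy a few headers from every message in an mbox file into SQLite.

    Hypothetical helper: the table layout is an assumption for this sketch.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, message_id TEXT, sender TEXT, subject TEXT)"
    )
    # create=False makes a missing file an error instead of creating an empty mbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        rows = [
            (
                str(msg.get("Message-ID", "")),
                str(msg.get("From", "")),
                str(msg.get("Subject", "")),
            )
            for msg in box
        ]
    finally:
        box.close()
    with conn:  # commit all inserts in one transaction
        conn.executemany(
            "INSERT INTO messages (message_id, sender, subject) VALUES (?, ?, ?)",
            rows,
        )
    return conn
```

This reads all header rows into memory before inserting, which is fine for a sketch but would probably want batching for a very large mailbox.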
Feature request: parse email content to extract unsubscribe links and allow me to unsubscribe from most frequent senders easily
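One possible starting point for this, sketched in Python: instead of scraping message bodies, read the `List-Unsubscribe` header (RFC 2369), which many bulk senders set and which wraps each unsubscribe URI in angle brackets. The function names here are hypothetical, and this is not something the tool does today as far as this thread shows.

```python
import re
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def unsubscribe_links(header_value):
    """Return the URIs found in a List-Unsubscribe header value."""
    # RFC 2369 lists each URI inside <...>, separated by commas.
    return re.findall(r"<([^>]+)>", header_value)


def frequent_senders(raw_messages, limit=10):
    """Rank senders by message count and attach any unsubscribe links seen."""
    counts = Counter()
    links = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        sender = parseaddr(str(msg.get("From", "")))[1].lower()
        counts[sender] += 1
        found = unsubscribe_links(str(msg.get("List-Unsubscribe", "")))
        if found:
            links[sender] = found  # keep the most recently seen links
    return [(s, n, links.get(s, [])) for s, n in counts.most_common(limit)]
```

Senders that never set the header come back with an empty list, so those would still need body parsing or a manual unsubscribe.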
I admittedly might just not have or understand the use case, nor have I thought about how large a Gmail account actually is, so feel free to ignore this if I'm missing something!
- Searching a plain text data file is O(n). Searching a SQLite database that has been properly indexed, which is very easy to do nowadays with FTS5, is O(log n) worst case scenario and O(1) in the best case. This doesn't explain why SQLite over a dataframe or anything, but it definitely justifies it over plain text for large email collections.
- SQLite is really easy to write custom views and programs around. Virtually every major programming language can work with it without issue. See also: simonw's wonderful https://datasette.io/ .
- SQLite is an accepted archival format by the Library of Congress, if you ever want to go down the rabbit hole of digital preservation.
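To make the FTS5 point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes the SQLite library bundled with your Python was compiled with FTS5 (common, but not guaranteed). The `mail_fts` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A full-text index over subject and body; requires an FTS5-enabled SQLite build.
conn.execute("CREATE VIRTUAL TABLE mail_fts USING fts5(subject, body)")
conn.executemany(
    "INSERT INTO mail_fts (subject, body) VALUES (?, ?)",
    [
        ("Invoice for March", "Please find the invoice attached."),
        ("Lunch on Friday?", "Are you free for lunch?"),
        ("Re: invoice question", "Thanks, paid."),
    ],
)


def search(conn, query, limit=10):
    """Return (rowid, subject) pairs matching an FTS5 query, best match first."""
    return conn.execute(
        "SELECT rowid, subject FROM mail_fts "
        "WHERE mail_fts MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


for rowid, subject in search(conn, "invoice"):
    print(rowid, subject)
```

The query goes through the full-text index rather than scanning every message, which is the difference from grepping a plain text file that the first bullet describes.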
I think it's a big eml file.
- Open source
- Resume (so backups/restores will eventually complete)
Honorable mention: https://www.mailstore.com/en/products/mailstore-home/
- Not open source
- GUI with index: nice for searching mail locally
- Resume only for backup (so large restores generally fail)
https://github.com/gaubert/gmvault
For a long time and it's worked great. But it seems like GYB is actively maintained, so maybe I should switch.
(N.B. it can also process the mbox produced by https://takeout.google.com/ with the caveat that in some cases Takeout permanently loses some information because of a bug with older encodings, so I'd always prefer a backup using gmvault or imap)
I have a reminder to trigger this every few months and update my local backup. If I recall it comes as a gzipped mbox file.
To me having to install a tool through Python is a show-stopper.
[0] https://github.com/djipko/gbackup-rs
I have that tool activating once every 24h still, to this day.
* Google collects vast amounts of personal data, specifically through receiving all of your email and analyzing it.
* It builds elaborate user profiles and uses them to target you with ads designed to better influence you.
* Its hold on information (from different sources) has made it excessively powerful economically, and thus also politically.
* Google/Alphabet has long started to affect legislation, including through direct registered lobbying: ~15 Million USD in 2024 (opensecrets.org).
* It has been known to pass, and likely still passes, the information it collects - including copies of your email correspondence - on to the US government (Edward Snowden leaks).
and finally:
* There are multiple email providers, many of them quite good - both for pay and gratis. Naturally most of the gratis ones have their own interests, but nothing like Google.
Edit: You can create groups of people that are not affected by Google/Apple/Facebook, this should be seen as a goal.
$4/m seems nominal for a 50GB mailbox with no weird adtech shit built in.
Gmail != Google
WorkMail != Amazon
Gmail is targeted at consumers and is engineered to suck up your data to pay for itself. WorkMail is targeted at businesses and is engineered to not piss off IT administrators and middle management.
I could also pay for a Google Workspace and stay with Gmail.
Anyway, among the popular providers, I am partial to ProtonMail:
https://en.wikipedia.org/wiki/Proton_Mail
it's Swiss, run by a non-profit, and originally crowd-funded. Over 100 Million users, IIANM. I am not a cyber-security expert, so I can't claim to have audited them for security or privacy bona fides (but maybe somebody has?).
I've also heard suggestions to try Tuta (tuta-nota), but have never tried it myself.
There are also many smaller providers. Specifically, many Internet access providers also provide email services. Not that they are super trustworthy, but there's a good chance you're not just handing everything over to one of the behemoths like Google or Microsoft.
If you do end up on one of the bigger providers - it's probably best to be on one that's not linked to the government where you live. So if you're in a NATO country you could go for yandex or mail.ru and if in Russia then maybe GMX?
Unfortunately, wherever you take your email, when you write to a Gmail address, Alphabet has a hold of your correspondence again. So, we need to convince our friends to ditch Google as well.
A problem with desktop clients is that they are usually generic IMAP clients, meaning they have limited support for server-side search, non-standard features like labels, and creating server-side filters.
Also, hosting email under your own domain gives you the freedom to move from one email provider to another even if they do shut down.
I put my money where my mouth is. I wanted to degoogle and so pay $50/year for Fastmail. One feature I like is automatically snoozing certain emails. Most of my non-personal email is automatically snoozed until 6pm every day. This way I don't get multiple notifications throughout the day for emails that aren't time sensitive.