Domain design for backend storage? #1

Closed
xeoncross opened this issue Mar 2, 2019 · 12 comments
Labels
enhancement New feature or request


@xeoncross

xeoncross commented Mar 2, 2019

Instead of combining the SQL queries with the backend logic, would it be possible to create domain objects so that multiple backends (SQL, key/value, disk, AWS storage, memory, etc.) could be written to use the great backend logic you're pulling together?

In other words, simply provide a type BackendStorage interface{ ... } to whatever logic you're working on so that multiple storage solutions could be built and used.
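
The idea above could be sketched roughly as follows. This is only an illustration: the method set of BackendStorage, the MemoryStorage type, and the deliver helper are all hypothetical and are not part of go-sqlmail or go-imap.

```go
package main

import (
	"errors"
	"fmt"
)

// BackendStorage is a hypothetical storage interface. Its methods are an
// assumption made for illustration; the issue only proposes the name.
type BackendStorage interface {
	PutMessage(mailbox string, raw []byte) (id int, err error)
	GetMessage(mailbox string, id int) ([]byte, error)
}

// ErrNotFound is returned when a mailbox has no message with the given id.
var ErrNotFound = errors.New("message not found")

// MemoryStorage is one possible backend: it keeps messages in a map.
type MemoryStorage struct {
	boxes map[string][][]byte
}

func NewMemoryStorage() *MemoryStorage {
	return &MemoryStorage{boxes: make(map[string][][]byte)}
}

func (m *MemoryStorage) PutMessage(mailbox string, raw []byte) (int, error) {
	// Copy raw so later changes by the caller do not affect the stored message.
	m.boxes[mailbox] = append(m.boxes[mailbox], append([]byte(nil), raw...))
	return len(m.boxes[mailbox]) - 1, nil
}

func (m *MemoryStorage) GetMessage(mailbox string, id int) ([]byte, error) {
	msgs := m.boxes[mailbox]
	if id < 0 || id >= len(msgs) {
		return nil, ErrNotFound
	}
	return msgs[id], nil
}

// deliver stands in for "backend logic": it depends only on the interface,
// so any storage implementation can be plugged in.
func deliver(s BackendStorage, mailbox, body string) (int, error) {
	return s.PutMessage(mailbox, []byte(body))
}

func main() {
	var s BackendStorage = NewMemoryStorage()
	id, err := deliver(s, "INBOX", "Subject: hi\r\n\r\nhello")
	if err != nil {
		panic(err)
	}
	raw, _ := s.GetMessage("INBOX", id)
	fmt.Printf("stored message %d (%d bytes)\n", id, len(raw))
}
```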

Then again, I realize it's called "go-sqlmail", but you're building a needed tool and I wanted to mention this as soon as possible to see if it could be a goal.

@foxcpp
Owner

foxcpp commented Mar 3, 2019

The Backend interface defined by go-imap is already a storage abstraction, and I don't think it is a good idea to add another abstraction for the same purpose. There is not much "backend logic" in go-sqlmail: it just sends queries to the RDBMS and converts the responses to the format consumed by go-imap. If somebody wants to implement AWS/disk/etc. storage, they should probably just implement the go-imap interface. This will be much easier, especially since there is the backendutil package, which contains basically all the logic you would need for an implementation.

However, this may all be different for the SMTP backend. I haven't looked into it, but go-smtp may be a little too low-level. In that case, I'm going to contribute implementations of higher-level interfaces to go-smtp instead of keeping them here.

I had some ideas about moving message contents to external non-SQL storage while keeping metadata in the DB, but this is not a priority because popular RDBMSs handle large blobs in rows well enough.

@foxcpp
Owner

foxcpp commented Mar 4, 2019

If you don't have any other questions - close the issue.

@xeoncross
Author

Thanks for your reply and explanation. Keep up the good work!

@sapiens-sapide

Hello @foxcpp @xeoncross,

FYI, I'm also looking for a way to store only the relevant data in the DB and leave raw mail on the filesystem or in an object store (because it's not a good idea to store huge emails in an SQL DB).

I'm amazed by the work done by @emersion and by you, @foxcpp, and I'd like to join the journey!
Here is my point of view, based on my experience:
all you need to handle emails are 3 capabilities:

  1. an object store to handle raw email (could be FS or S3)
  2. a database to handle metadata about emails and basic search capabilities.
  3. a full-text engine to handle more advanced searches.

At this stage, and as far as I understand, the go-imap backend interfaces may be too broad to easily build different storage solutions on top of them.

To go further, I think we should agree on:

  • an extended model to store relevant metadata in the DB: the goal is to satisfy most IMAP (and SMTP) commands, except those that require fetching the full email body (think of how the Dovecot index works).
  • an extended IMAP-backend interface that defines all the functions a store must implement for IMAP commands or command sets (mailboxes, messages, search, idle, etc.). Note that authentication and user management should not fall into this interface.
  • similarly, an SMTP-backend interface that defines all the functions a store must implement for SMTP commands.

Thus, we'll build different backend solutions that all agree and rely on common interface sets, but with different implementations or storage paradigms.
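
As a rough sketch of the three capabilities listed above (object store, metadata database, full-text engine) behind separate interfaces: every name here (BlobStore, MetaStore, TextIndex, Meta, Store, and the in-memory types) is hypothetical, and the method sets are assumptions chosen only to keep the example small.

```go
package main

import (
	"fmt"
	"strings"
)

// BlobStore holds raw messages (could be a filesystem or S3). Hypothetical.
type BlobStore interface {
	Put(key string, raw []byte) error
	Get(key string) ([]byte, error)
}

// Meta is the per-message metadata kept in the database. Hypothetical.
type Meta struct {
	Key     string
	Subject string
	Size    int
}

// MetaStore answers commands that do not need the full body. Hypothetical.
type MetaStore interface {
	Add(mailbox string, m Meta) error
	List(mailbox string) []Meta
}

// TextIndex handles more advanced searches. Hypothetical.
type TextIndex interface {
	Index(key, text string)
	Search(term string) []string
}

type memBlobs map[string][]byte

func (b memBlobs) Put(key string, raw []byte) error {
	b[key] = raw
	return nil
}

func (b memBlobs) Get(key string) ([]byte, error) {
	raw, ok := b[key]
	if !ok {
		return nil, fmt.Errorf("no blob %q", key)
	}
	return raw, nil
}

type memMeta map[string][]Meta

func (m memMeta) Add(mailbox string, meta Meta) error {
	m[mailbox] = append(m[mailbox], meta)
	return nil
}

func (m memMeta) List(mailbox string) []Meta { return m[mailbox] }

// memIndex is a naive substring "index"; a real engine would tokenize.
type memIndex map[string]string

func (i memIndex) Index(key, text string) { i[key] = strings.ToLower(text) }

func (i memIndex) Search(term string) []string {
	var keys []string
	for k, text := range i {
		if strings.Contains(text, strings.ToLower(term)) {
			keys = append(keys, k)
		}
	}
	return keys
}

// Store combines the three parts; each can be swapped independently.
type Store struct {
	Blobs BlobStore
	Meta  MetaStore
	Text  TextIndex
}

// Append writes the raw body, indexes its text, and records its metadata.
func (s Store) Append(mailbox, key, subject, body string) error {
	if err := s.Blobs.Put(key, []byte(body)); err != nil {
		return err
	}
	s.Text.Index(key, subject+" "+body)
	return s.Meta.Add(mailbox, Meta{Key: key, Subject: subject, Size: len(body)})
}

func main() {
	s := Store{Blobs: memBlobs{}, Meta: memMeta{}, Text: memIndex{}}
	if err := s.Append("INBOX", "k1", "Lunch", "Pizza at noon?"); err != nil {
		panic(err)
	}
	fmt.Println(s.Meta.List("INBOX"), s.Text.Search("pizza"))
}
```

Listing a mailbox touches only MetaStore, which is the point of the split: the raw body is fetched from BlobStore only when a command actually needs it.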

I've written down a schema that goes in this direction, and I'd definitely share it with you if you think it's a good direction.

Stan.

@foxcpp
Owner

foxcpp commented Mar 13, 2019

The primary design goal of go-sqlmail is to provide a storage backend for maddy. maddy is currently nothing but a proxy, but I posted some ideas here: foxcpp/maddy#15. @sapiens-sapide, you may want to join the discussion, since we are moving in the direction of interface splitting and so your work can be useful here.

I overlooked some design issues when I started development of go-sqlmail, so now I think it is a good idea to support different storage for metadata and body.

@foxcpp foxcpp reopened this Mar 13, 2019
@foxcpp
Owner

foxcpp commented Mar 25, 2019

It doesn't look like we have any progress here. So I will document my own ideas on how I would design it.

First, we abstract away the "object store" using the following interface (we use io.Reader because, if the underlying storage supports a streaming API, it will be useful to us).

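// ObjectStore abstracts storage of message bodies.
type ObjectStore interface {
  // Store saves the contents of data under key.
  Store(key string, data io.Reader) error
  // Load opens the object saved under key; the caller must close it.
  Load(key string) (io.ReadCloser, error)
  // Delete removes the object saved under key.
  Delete(key string) error
}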
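
To make the interface above concrete, here is a minimal in-memory implementation. MemStore, ErrNoSuchKey, and the two string helpers are hypothetical names used only for this sketch; it also assumes Go 1.16 or later for io.ReadAll and io.NopCloser.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
	"sync"
)

// ObjectStore is the interface proposed in this comment.
type ObjectStore interface {
	Store(key string, data io.Reader) error
	Load(key string) (io.ReadCloser, error)
	Delete(key string) error
}

// ErrNoSuchKey is a hypothetical error for missing objects.
var ErrNoSuchKey = errors.New("objectstore: no such key")

// MemStore keeps objects in a map guarded by a mutex.
type MemStore struct {
	mu      sync.Mutex
	objects map[string][]byte
}

func NewMemStore() *MemStore {
	return &MemStore{objects: make(map[string][]byte)}
}

func (s *MemStore) Store(key string, data io.Reader) error {
	// Drain the reader before taking the lock so slow readers do not block others.
	blob, err := io.ReadAll(data)
	if err != nil {
		return err
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.objects[key] = blob
	return nil
}

func (s *MemStore) Load(key string) (io.ReadCloser, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	blob, ok := s.objects[key]
	if !ok {
		return nil, ErrNoSuchKey
	}
	return io.NopCloser(bytes.NewReader(blob)), nil
}

func (s *MemStore) Delete(key string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.objects, key)
	return nil
}

// storeString and loadString are small helpers for callers that hold strings.
func storeString(s ObjectStore, key, body string) error {
	return s.Store(key, strings.NewReader(body))
}

func loadString(s ObjectStore, key string) (string, error) {
	rc, err := s.Load(key)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	blob, err := io.ReadAll(rc)
	return string(blob), err
}

func main() {
	var store ObjectStore = NewMemStore()
	if err := storeString(store, "msg-1", "Subject: hi\r\n\r\nhello\r\n"); err != nil {
		panic(err)
	}
	body, err := loadString(store, "msg-1")
	fmt.Printf("%q %v\n", body, err)
}
```

A backend with a real streaming API could pass the reader straight through instead of buffering it, which is where taking io.Reader rather than []byte pays off.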

Second, add an ObjectStore field to imap.Opts{} and a body_ext column to the msgs table.
If body_ext is 1, use ObjectStore with the contents of the body column as the key. This allows a simple migration: rows that already hold the body inline keep working, and if ObjectStore is set, all new messages will use it. Keys for ObjectStore are independent of IMAP UIDs, so we will not have to rewrite them if we invalidate UIDs for whatever reason.
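
The body_ext rule could look roughly like this in code. This is a sketch, not the go-sqlmail implementation: msgRow, newRow, openBody, readBody, and mapStore are hypothetical, and the SQL layer is replaced by a plain struct with the two relevant columns.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// ObjectStore is the interface proposed earlier in this comment.
type ObjectStore interface {
	Store(key string, data io.Reader) error
	Load(key string) (io.ReadCloser, error)
	Delete(key string) error
}

// mapStore is a throwaway in-memory ObjectStore used only for this demo.
type mapStore map[string][]byte

func (m mapStore) Store(key string, data io.Reader) error {
	blob, err := io.ReadAll(data)
	if err != nil {
		return err
	}
	m[key] = blob
	return nil
}

func (m mapStore) Load(key string) (io.ReadCloser, error) {
	blob, ok := m[key]
	if !ok {
		return nil, fmt.Errorf("no object %q", key)
	}
	return io.NopCloser(bytes.NewReader(blob)), nil
}

func (m mapStore) Delete(key string) error {
	delete(m, key)
	return nil
}

// msgRow models the body and body_ext columns of one row in msgs.
type msgRow struct {
	Body    []byte // message contents, or the ObjectStore key if BodyExt is set
	BodyExt bool   // body_ext = 1
}

// newRow builds the row for a new message: external if a store is set,
// inline otherwise.
func newRow(store ObjectStore, key string, body []byte) (msgRow, error) {
	if store == nil {
		return msgRow{Body: body}, nil
	}
	if err := store.Store(key, bytes.NewReader(body)); err != nil {
		return msgRow{}, err
	}
	return msgRow{Body: []byte(key), BodyExt: true}, nil
}

// openBody returns the message contents regardless of where they live.
func openBody(store ObjectStore, row msgRow) (io.ReadCloser, error) {
	if !row.BodyExt {
		return io.NopCloser(bytes.NewReader(row.Body)), nil
	}
	if store == nil {
		return nil, errors.New("body is external but no ObjectStore is configured")
	}
	return store.Load(string(row.Body))
}

// readBody is a convenience wrapper that reads the whole body as a string.
func readBody(store ObjectStore, row msgRow) (string, error) {
	rc, err := openBody(store, row)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	blob, err := io.ReadAll(rc)
	return string(blob), err
}

func main() {
	store := mapStore{}
	legacy := msgRow{Body: []byte("old inline body")} // written before migration
	fresh, err := newRow(store, "key-1", []byte("new external body"))
	if err != nil {
		panic(err)
	}
	for _, row := range []msgRow{legacy, fresh} {
		text, err := readBody(store, row)
		fmt.Println(row.BodyExt, text, err)
	}
}
```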

The ObjectStore key does not change if we move messages. It does not change if we copy messages either, since message contents are immutable in IMAP. But then we also have to keep track of references to ObjectStore keys. This is done via a separate table called ext_keys.

I'm not sure about how to interface with the FTS engine, however.

@foxcpp foxcpp self-assigned this May 5, 2019
@foxcpp foxcpp added this to the go-sqlmail v1 milestone May 5, 2019
@foxcpp foxcpp added the enhancement New feature or request label May 5, 2019
@foxcpp
Owner

foxcpp commented May 5, 2019

I'm working on this. go-imap-sql 0.2 will support a so-called "external store" interface. There will be a maildir package implementing an external store using a Maildir-like structure. It will not be used by default, though.
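
For illustration only, a directory-backed store in the Maildir spirit (write into tmp/, then rename into cur/) might look like the sketch below. It assumes the three-method ObjectStore shape proposed earlier in this thread, not the actual go-imap-sql external store interface or its maildir package, and DirStore, validKey, and demo are hypothetical names. It needs Go 1.16 or later for os.CreateTemp and os.MkdirTemp.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// DirStore keeps one file per key under root/cur, staging writes in root/tmp.
type DirStore struct{ root string }

func NewDirStore(root string) (*DirStore, error) {
	for _, sub := range []string{"tmp", "cur"} {
		if err := os.MkdirAll(filepath.Join(root, sub), 0o700); err != nil {
			return nil, err
		}
	}
	return &DirStore{root: root}, nil
}

// validKey rejects keys that could escape the store directory.
func validKey(key string) bool {
	return key != "" && key != "." && key != ".." && !strings.ContainsAny(key, `/\`)
}

func (d *DirStore) Store(key string, data io.Reader) error {
	if !validKey(key) {
		return fmt.Errorf("invalid key %q", key)
	}
	tmp, err := os.CreateTemp(filepath.Join(d.root, "tmp"), "msg-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; harmless after rename
	if _, err := io.Copy(tmp, data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// Rename so readers never observe a partially written message.
	return os.Rename(tmp.Name(), filepath.Join(d.root, "cur", key))
}

func (d *DirStore) Load(key string) (io.ReadCloser, error) {
	if !validKey(key) {
		return nil, fmt.Errorf("invalid key %q", key)
	}
	f, err := os.Open(filepath.Join(d.root, "cur", key))
	if err != nil {
		return nil, err
	}
	return f, nil
}

func (d *DirStore) Delete(key string) error {
	if !validKey(key) {
		return fmt.Errorf("invalid key %q", key)
	}
	return os.Remove(filepath.Join(d.root, "cur", key))
}

// demo round-trips one message through a DirStore in a temporary directory.
func demo(key, body string) (string, error) {
	root, err := os.MkdirTemp("", "dirstore-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)
	store, err := NewDirStore(root)
	if err != nil {
		return "", err
	}
	if err := store.Store(key, strings.NewReader(body)); err != nil {
		return "", err
	}
	rc, err := store.Load(key)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	got, err := io.ReadAll(rc)
	return string(got), err
}

func main() {
	got, err := demo("msg-1", "Subject: hi\r\n\r\nhello\r\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read back %d bytes\n", len(got))
}
```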

@foxcpp
Owner

foxcpp commented May 7, 2019

The interface is here: https://github.com/foxcpp/go-imap-sql/blob/extstore/external_store.go.
Are we good to go and should this issue be closed, or do you have any concerns regarding the design?

@sapiens-sapide

I'm still working on another design in which mails are stored as 'raw' objects (i.e. full bytes, including headers), alongside the mails' metadata and lexicon in a separate database and index. I think it's better to store the raw emails without any embedded logic, not even an mbox reference.
It's still a WIP and I lack time, so I'm sorry I can't share anything for now… Maybe in a few weeks.

@foxcpp
Owner

foxcpp commented May 8, 2019

Take your time, we are not in a hurry. Your design may turn out to be better, so I guess I will wait for it instead of releasing go-imap-sql 0.2 with my experiment (I'm calling it that because I'm not sure how well it works).

@foxcpp
Owner

foxcpp commented May 11, 2019

Okay, I went ahead and simplified the "store" interface.

@foxcpp
Owner

foxcpp commented May 12, 2019

Merged extstore into dev.

@foxcpp foxcpp closed this as completed May 12, 2019

3 participants