Modular design #15

foxcpp · 2019-03-10T19:19:34Z

--
Original post:

Let's say I configured an IMAP endpoint as follows (where the first line inside the block creates the IMAP backend):

imap://127.0.0.1:1993 {
    sql sqlite3 maddy.db
}

Then I want to have an SMTP endpoint that will deliver mail to the same storage. How would I do this?

Assuming that the backend also provides an implementation of the SMTP upstream (perhaps as a separate object):

smtp://127.0.0.1:1025 {
	sql sqlite3 maddy.db
}

This approach creates another set of problems, because it now requires two separate backend/upstream objects to coordinate access to the same storage (think of IMAP unilateral updates).
Global variables? External IPC sockets? All of these seem like dirty solutions.

What if we could create one "storage" object and associate it with multiple IMAP/SMTP endpoints?
This would turn "another set of problems" into just serializing access to the storage object, which is easily solved by throwing some mutexes into it (or even without them; I haven't tested that, but the go-sqlmail backend object should be safe for concurrent use by multiple goroutines).

It also reduces resource usage (we will have only one SQLite "connection" page cache, for example).
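A minimal sketch of that idea, assuming a toy in-memory store rather than the real go-sqlmail backend. The `Storage`, `Endpoint`, `Deliver` and `Messages` names are hypothetical and only illustrate two endpoints sharing one mutex-guarded object:

```go
package main

import "sync"

// Storage is a hypothetical shared message store (not the go-sqlmail API).
// All access is serialized with a single mutex.
type Storage struct {
	mu        sync.Mutex
	mailboxes map[string][]string // user -> message bodies
}

func NewStorage() *Storage {
	return &Storage{mailboxes: make(map[string][]string)}
}

// Deliver appends a message to the user's mailbox.
// This is what an SMTP endpoint would call.
func (s *Storage) Deliver(user, body string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.mailboxes[user] = append(s.mailboxes[user], body)
}

// Messages returns a copy of the user's messages.
// This is what an IMAP endpoint would call.
func (s *Storage) Messages(user string) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]string(nil), s.mailboxes[user]...)
}

// Endpoint stands in for an IMAP or SMTP listener that holds a reference
// to the shared storage instead of creating its own.
type Endpoint struct {
	Addr    string
	Storage *Storage
}
```

Both endpoints would be constructed with the same `*Storage` pointer, so a message delivered through the SMTP endpoint is immediately visible through the IMAP one.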

Now I can imagine something like this:

backend sql arbitrary_name {
  driver sqlite3
  dsn maddy.db
}

imap://127.0.0.1:1993 {
  backend arbitrary_name 
  # of course this requires storage object to implement go-imap's Backend interface
}

smtp://127.0.0.1:1025 {
  backend arbitrary_name 
  # and also go-smtp's Backend here now.
}

What do you think?


emersion · 2019-03-10T23:48:10Z

Yeah, +1 from me. I wonder if we need some kind of Storage interface in maddy. This may make things like unilateral updates easier to handle.

emersion · 2019-03-10T23:48:30Z

Also we need to abstract away authentication

foxcpp · 2019-03-11T13:40:27Z

Continuing the idea of separating backends from endpoints...

Module-based maddy design

Module concept

Each interface required by maddy for operation is provided by an object called a "module".
This includes authentication, storage backends, DKIM, email filters, etc.
Each module may serve multiple functions. For example, a go-sqlmail module could implement IMAP backend/storage, delivery to IMAP mailboxes (thus an SMTP backend) and authentication. In order to use a module you first need to create an instance of it (read on).

Each module gets its own unique name (sqlmail for go-sqlmail, proxy for a proxy module, local for local delivery perhaps, etc). Each module instance also gets its own (also unique) name, which is used to refer to it in configuration. Both module and instance names can be any strings allowed in Caddyfiles without escaping (???, I don't know much about Caddyfile format, correct me here).

Endpoint listeners are modules too; they just don't implement any interface and simply start listening on the address taken from the instance name after initialization.

Here is the most minimal interface for any module:

type Module interface {
  // Unique module name. Used in configuration and in logs.
  Name() string

  // Returns module version. May be printed to log and probably exposed to clients using extensions like IMAP ID.
  Version() string
}

type NewModule func(instName string, cfg caddyfile.???) error

And here is generic syntax for configuration:

module-name instance-name {
  module-configuration
}

Each block of this form creates a new instance of a module. A failure in module initialization (an error returned by NewModule) is fatal, and maddy will terminate after it.
Modules can refer to each other using an application-global index. For example, when you specify the auth. provider to use in an endpoint's block by instance name, the endpoint module can get the instance object using this name.
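The application-global index could look roughly like the sketch below. `Registry`, `Register`, `Auth` and `staticAuth` are hypothetical names invented for illustration; only the `Module` and `AuthProvider` method sets come from this proposal:

```go
package main

import (
	"fmt"
	"sync"
)

// Module is the minimal interface from the proposal.
type Module interface {
	Name() string
	Version() string
}

// AuthProvider is the 'auth' interface from the proposal.
type AuthProvider interface {
	CheckPlain(username, password string) bool
}

// Registry is a hypothetical application-global index of module instances.
type Registry struct {
	mu        sync.RWMutex
	instances map[string]Module
}

func NewRegistry() *Registry {
	return &Registry{instances: make(map[string]Module)}
}

// Register adds an instance under its name; instance names must be unique.
func (r *Registry) Register(instName string, m Module) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.instances[instName]; dup {
		return fmt.Errorf("instance name %q is already in use", instName)
	}
	r.instances[instName] = m
	return nil
}

// Auth looks up an instance by name and checks that it implements AuthProvider.
func (r *Registry) Auth(instName string) (AuthProvider, error) {
	r.mu.RLock()
	m, ok := r.instances[instName]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown module instance %q", instName)
	}
	auth, ok := m.(AuthProvider)
	if !ok {
		return nil, fmt.Errorf("module %s (instance %q) does not implement auth", m.Name(), instName)
	}
	return auth, nil
}

// staticAuth is a toy module used only to exercise the registry.
type staticAuth struct{ user, pass string }

func (staticAuth) Name() string                  { return "static" }
func (staticAuth) Version() string               { return "0.0.0" }
func (a staticAuth) CheckPlain(u, p string) bool { return u == a.user && p == a.pass }
```

An endpoint module configured with `auth sqlstorage` would call something like `Auth("sqlstorage")` during its own initialization and fail startup if the lookup returns an error.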

'storage' interface

When you store information about email in the modern world, you definitely store it together with IMAP meta-data and in an IMAP-friendly format. That's it. IMAP defines the email storage structure. There is very little room for freedom. For this very reason we make the terms "IMAP backend" and "storage" mean the same thing: a place where we can put emails and read them later. In this sense, a proxy is a kind of storage too: it stores email on a different server. To avoid confusion we will use only the term "storage" from now on.

Basically, this is the go-imap/backend.Backend interface with authentication removed. There is just GetUser(username string) backend.User instead.

IMAP extensions that modify/extend storage behavior or require knowledge of its state are handled here. These extensions are enabled at the endpoint level only if they are supported by the 'storage' implementation.

'auth' interface

type AuthProvider interface {
  CheckPlain(username, password string) bool
}

'filter' interface

Used in SMTP pipeline to mutate or drop messages during processing.

type Filter interface {
  // Apply is allowed to change the body.
  // opts are optional "context values" set in configuration; they can be used to
  // tweak filter behavior on a per-message basis.
  // If it returns a non-nil error, the message is dropped.
  Apply(ctx *DeliveryContext, body *bytes.Buffer) error
}

Here DeliveryContext is a structure that contains basic information about SMTP client, SMTP envelope information (FROM, RCPT), opts set in configuration and arbitrary values that may be set by other filters.
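As a rough sketch of what that structure and a trivial filter might look like: all field names, the `addHeader` filter and the `ApplyAll` helper are assumptions made for illustration, since the proposal only describes the contents of DeliveryContext in prose.

```go
package main

import (
	"bytes"
	"errors"
)

// DeliveryContext field names are hypothetical; the proposal lists only what
// kind of information the structure carries.
type DeliveryContext struct {
	SrcIP       string                 // IP of the connected client
	SrcHostname string                 // hostname reported in EHLO/HELO
	From        string                 // envelope sender
	To          []string               // envelope recipients
	Opts        map[string]string      // opts set in configuration
	Values      map[string]interface{} // arbitrary values set by other filters
}

// Filter is the 'filter' interface from the proposal.
type Filter interface {
	Apply(ctx *DeliveryContext, body *bytes.Buffer) error
}

// addHeader is a toy filter that prepends one header line to the message.
type addHeader struct{ line string }

func (f addHeader) Apply(ctx *DeliveryContext, body *bytes.Buffer) error {
	if len(ctx.To) == 0 {
		// A non-nil error means the message is dropped.
		return errors.New("no recipients in envelope")
	}
	// Copy the old contents first: Bytes() aliases the buffer's memory.
	orig := append([]byte(nil), body.Bytes()...)
	body.Reset()
	body.WriteString(f.line + "\r\n")
	body.Write(orig)
	if ctx.Values == nil {
		ctx.Values = make(map[string]interface{})
	}
	ctx.Values["header-added"] = true // visible to later filters
	return nil
}

// ApplyAll runs filters in order and returns the resulting body.
func ApplyAll(ctx *DeliveryContext, msg string, filters ...Filter) (string, error) {
	body := bytes.NewBufferString(msg)
	for _, f := range filters {
		if err := f.Apply(ctx, body); err != nil {
			return "", err
		}
	}
	return body.String(), nil
}
```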

'delivery' interface

Basically the same as filter except it is not allowed to change anything.

type Delivery interface {
  Deliver(ctx DeliveryContext, body bytes.Buffer) error
}

'imap' module

Configuration options: auth - sets auth provider to use, storage - sets storage backend to use.
Etc, tls, blah-blah.

SMTP pipeline

SMTP doesn't impose any restrictions on how we can process email. SMTP is basically "I give you that message and I want it to be seen by Alice and Bob, do whatever you need to get this done".

So we define the "SMTP pipeline" concept here: a sequence of module instances that can transform messages however they want, save them somewhere, send them to a different server, or all of this at once. This allows users to construct arbitrarily complex chains to describe any logic they need.

There are several variables (like tls) and a set of possible pipeline steps (described below).
The auth variable sets the auth provider to use. Pipeline steps are applied in the order in which they are defined in the config.

The 'filter' step applies the instance_name filter to the message, passing the specified opts as the first argument.

filter <instance_name> [opts]

The 'delivery' step pushes the message to the instance_name SMTP backend and continues processing (this is necessary to correctly support multiple recipients, both local and remote).

delivery <instance_name> [opts]

Pipeline steps wrapped in a match block run only if the condition of the match block is satisfied.

match value-name pattern {
  other-pipeline-items
}

Adding no between match and value-name inverts the condition.

Value-name can be one of these:

rcpt-domain - recipient's domain
rcpt - recipient's email
from - sender email
src-ip - IP of connected client
src-hostname - FQDN as reported by connected client in EHLO/HELO

The pattern is not actually a pattern by default: the value just has to be equal to it. If you wrap it in forward slashes /like that/ then it is interpreted as a regexp, and a partial match is enough.
For value-names that can have multiple values (rcpt, rcpt-domain), only one of them needs to match.
TBD: Is it enough for most common cases?
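The matching rules described here are small enough to sketch directly. `MatchCondition` and `matchAny` are hypothetical helper names, and treating a pattern as a regexp only when it both starts and ends with a slash is an assumption about how the wrapping would be detected:

```go
package main

import (
	"regexp"
	"strings"
)

// matchAny reports whether at least one value satisfies the pattern.
// A pattern wrapped in forward slashes is compiled as a regexp and may match
// any part of the value; any other pattern must be equal to the value.
func matchAny(pattern string, values []string) (bool, error) {
	if len(pattern) >= 2 && strings.HasPrefix(pattern, "/") && strings.HasSuffix(pattern, "/") {
		re, err := regexp.Compile(pattern[1 : len(pattern)-1])
		if err != nil {
			return false, err
		}
		for _, v := range values {
			if re.MatchString(v) {
				return true, nil
			}
		}
		return false, nil
	}
	for _, v := range values {
		if v == pattern {
			return true, nil
		}
	}
	return false, nil
}

// MatchCondition evaluates a match block condition. invert corresponds to
// the "no" keyword in the configuration.
func MatchCondition(pattern string, values []string, invert bool) (bool, error) {
	matched, err := matchAny(pattern, values)
	if err != nil {
		return false, err
	}
	return matched != invert, nil
}
```

For rcpt and rcpt-domain the caller would pass every recipient (or recipient domain) of the message as `values`; for single-valued names such as from or src-ip it would pass a one-element slice.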

Obviously, this stops processing:

stop

Continue processing if client is logged in anonymously or using account from auth. provider. Otherwise - send "access denied" error and stop.

require-anonymous-auth

Require successful authentication using account from set auth. provider (auth variable).

require-auth
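One possible way to model these steps in code is as a list of functions executed in order. This is only a sketch under assumptions: `Msg`, `Step`, `Record`, `Match`, `Run` and `ErrAccessDenied` are invented names, `Record` stands in for real filter/delivery steps, and it assumes that a stop inside a match block ends the whole pipeline.

```go
package main

import "errors"

// Msg is a hypothetical, heavily reduced message state.
type Msg struct {
	RcptDomain    string
	Authenticated bool
	Trace         []string // names of the steps that ran, for demonstration
}

var ErrAccessDenied = errors.New("access denied")

// Step is one pipeline step. stop=true ends processing of the message.
type Step func(m *Msg) (stop bool, err error)

// Record stands in for a filter or delivery step; it only notes that it ran.
func Record(name string) Step {
	return func(m *Msg) (bool, error) {
		m.Trace = append(m.Trace, name)
		return false, nil
	}
}

// Stop models the "stop" step.
func Stop() Step {
	return func(*Msg) (bool, error) { return true, nil }
}

// RequireAuth models "require-auth": deny and stop unless authenticated.
func RequireAuth() Step {
	return func(m *Msg) (bool, error) {
		if !m.Authenticated {
			return true, ErrAccessDenied
		}
		return false, nil
	}
}

// Match runs the nested steps only if cond holds for the message.
func Match(cond func(*Msg) bool, steps ...Step) Step {
	return func(m *Msg) (bool, error) {
		if !cond(m) {
			return false, nil
		}
		return Run(m, steps...)
	}
}

// Run applies steps in configuration order until one stops or fails.
func Run(m *Msg, steps ...Step) (bool, error) {
	for _, s := range steps {
		if stop, err := s(m); stop || err != nil {
			return true, err
		}
	}
	return false, nil
}
```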

Config example

Here is an example of a complete IMAP+SMTP server configuration:
(again, I'm really not sure how much of this is allowed by Caddyfile syntax, correct me if I'm wrong)

# implements 'delivery', 'storage' and 'auth' interfaces.
sqlmail sqlstorage {
  driver sqlite3
  dsn /var/lib/maddy/maddy.db
}

# implements 'filter' interface
dkim dkim {
  public filepath
  private filepath
}

imap 0.0.0.0:993 {
  # Configuration variable
  tls auto
  
  auth sqlstorage
  storage sqlstorage
}

smtp 0.0.0.0:25 {
  # Configuration variables
  tls auto
  auth sqlstorage

  # SMTP pipeline definition.
  filter verify-hostname
  match rcpt-domain emersion.fr {
    require-anonymous-auth 
    filter dkim verify=true
    delivery sqlstorage
  }
  match no rcpt-domain emersion.fr {
    require-auth
    filter dkim sign=true
    delivery outgoing-queue
  }
}

This is nowhere near a complete proposal, I'm just dumping some ideas for discussion. Any questions, additions or related ideas?

xeoncross · 2019-03-13T14:45:59Z

I would love to see support for a flexible pipeline as not all SMTP servers deliver emails to an end client via IMAP. Here is the main use-case I have been working on:

SMTP-to-_________ Gateway. This might be an HTTP postback/webhook, a NATS message queue, or a logging system. The idea is that data is consumed from regular emails and pushed into a non-email system.

This would also allow many creative types of email delivery (IPFS storage, encryption, posting articles to a blog) in addition to queued processing for automated systems (logging, ticket creation, NLP, etc...)

If storage is flexible, and the pipeline chain-able, I can parse a streaming MIME message body in a memory safe way, encrypt the payloads, store it on S3 and further process it from another system.

foxcpp · 2019-03-13T17:56:46Z

@emersion, I would like to know your opinion on the proposed design. I think I'm done with the basic ideas.

I will start experimenting with an implementation of the ideas stated above in my maddy fork.

foxcpp/maddy#15

foxcpp · 2019-03-15T15:53:16Z

Alright, here we are hit by the limitations of the Caddyfile format.
We can't describe the proposed SMTP pipeline configuration with it.
There are no nested blocks, and repeated directives require a lot of crutches to be parsed at all.

What should we do? I would really like to switch to a different config format but I think it diverges too much from emersion's ideas about a Caddy-like server.

emersion · 2019-03-16T17:56:38Z

The overall goal of the configuration format is to keep it as simple as possible while still allowing for more complex (not too complex) scenario. An example would be your smtp pipeline proposal: it's already difficult to configure the basic "put it in the IMAP storage and don't bother me" setup.

I'd also like to make things secure by default: no need to configure complex pipelines to get DKIM.

A problem with that is losing customizability. Thoughts? I'll try to think of a better approach.

Your current approach looks pretty reasonable regardless. I think it'd be best to experiment with it and adjust it as needed. Here are a few more minor comments:

We can probably simplify dkim dkim { blocks to just dkim { (this is a DKIM block with an empty name).

Deliver(ctx DeliveryContext, body bytes.Buffer) error

I'd prefer to use streaming interfaces (io.Reader).

Sorry for taking so long to give feedback. While I'm pretty busy with IRL stuff right now (moving to a different country), I'd like to contribute too. Things will likely slow down in the next days/weeks.

If you want, you can join the ##emersion channel on Freenode to discuss.

emersion · 2019-03-16T17:57:19Z

I would really like to switch to a different config format but I think it diverges too much from emersion's ideas about Caddy-like server.

I'm fine with using a different parser btw. Caddy's is tedious to use imho.

foxcpp · 2019-03-16T19:11:30Z

The overall goal of the configuration format is to keep it as simple as possible while still allowing for more complex (not too complex) scenario. An example would be your smtp pipeline proposal: it's already difficult to configure the basic "put it in the IMAP storage and don't bother me" setup.
I'd also like to make things secure by default: no need to configure complex pipelines to get DKIM.

I guess we can have a reasonable default pipeline configuration while still allowing users to redefine it if they are OK with the increased complexity. We can also have a default set of backends (say, go-sqlmail with sqlite3 configured to store stuff at /var/maddy/messages.db).

smtp 0.0.0.0:25 {
  hostname emersion.fr
}

This would expand to something similar to what I showed as the full config example.

This approach preserves full flexibility while making maddy almost zero-configuration.

We can probably simplify dkim dkim { blocks to just dkim { (this is a DKIM block with an empty name).

Except that it should probably be a DKIM instance with the name "dkim", because instance names should be unique.

I'd prefer to use streaming interfaces (io.Reader).

Probably we can pass io.Reader and io.Writer to filters.
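A sketch of what a streaming variant could look like, assuming a hypothetical `StreamFilter` interface (not something settled in this thread). `Chain` connects filters with `io.Pipe` so that no filter has to buffer the whole message; `prependHeader` and `ApplyToString` exist only to demonstrate it.

```go
package main

import (
	"io"
	"strings"
)

// StreamFilter is a hypothetical streaming version of the 'filter' interface:
// it reads the message from r and writes the (possibly changed) message to w.
type StreamFilter interface {
	Apply(r io.Reader, w io.Writer) error
}

// prependHeader is a toy filter that writes one header line and then copies
// the rest of the message through unchanged.
type prependHeader struct{ line string }

func (f prependHeader) Apply(r io.Reader, w io.Writer) error {
	if _, err := io.WriteString(w, f.line+"\r\n"); err != nil {
		return err
	}
	_, err := io.Copy(w, r)
	return err
}

// Chain runs filters in order, streaming data between them through pipes.
// Simplification: if a later filter stops reading early, the goroutines of
// earlier filters are not cleaned up here.
func Chain(src io.Reader, dst io.Writer, filters ...StreamFilter) error {
	if len(filters) == 0 {
		_, err := io.Copy(dst, src)
		return err
	}
	r := src
	for _, f := range filters[:len(filters)-1] {
		pr, pw := io.Pipe()
		go func(f StreamFilter, in io.Reader) {
			// A nil error closes the pipe normally (reader sees EOF);
			// a filter error is propagated to the next filter's reads.
			pw.CloseWithError(f.Apply(in, pw))
		}(f, r)
		r = pr
	}
	return filters[len(filters)-1].Apply(r, dst)
}

// ApplyToString is a convenience wrapper for trying the chain on a string.
func ApplyToString(msg string, filters ...StreamFilter) (string, error) {
	var out strings.Builder
	err := Chain(strings.NewReader(msg), &out, filters...)
	return out.String(), err
}
```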

foxcpp · 2019-03-16T20:23:00Z

Support for additional SASL authentication methods

The authentication module in use should implement at least plaintext authentication.
It may also implement additional interfaces for other authentication methods.
If the auth. module implements an additional interface known to maddy, maddy will expose the corresponding auth. method to clients (the AUTH=METHOD capability for IMAP, for example).
Something like this for XOAUTH2:

type OAuthAuth interface {
  CheckOAuth(username, token string) bool
}
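Detecting such optional interfaces is a plain Go type assertion. In the sketch below, `SASLMechanisms`, `plainOnly` and `plainAndOAuth` are hypothetical names; only the two interfaces come from the proposal.

```go
package main

// AuthProvider is the mandatory 'auth' interface from the proposal.
type AuthProvider interface {
	CheckPlain(username, password string) bool
}

// OAuthAuth is the optional XOAUTH2 interface from the proposal.
type OAuthAuth interface {
	CheckOAuth(username, token string) bool
}

// SASLMechanisms returns the mechanisms an endpoint could advertise for the
// given provider: PLAIN always, XOAUTH2 only if the provider supports it.
func SASLMechanisms(p AuthProvider) []string {
	mechs := []string{"PLAIN"}
	if _, ok := p.(OAuthAuth); ok {
		mechs = append(mechs, "XOAUTH2")
	}
	return mechs
}

// plainOnly is a toy provider that supports only plaintext authentication.
type plainOnly struct{}

func (plainOnly) CheckPlain(username, password string) bool { return false }

// plainAndOAuth is a toy provider that also implements OAuthAuth.
type plainAndOAuth struct{}

func (plainAndOAuth) CheckPlain(username, password string) bool { return false }
func (plainAndOAuth) CheckOAuth(username, token string) bool    { return false }
```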

Default configuration

Unless the user explicitly specifies the auth. module instance to use, we try to use default-auth or default (in that order).
The same goes for the IMAP storage backend.
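The lookup order could be sketched as follows. `ResolveInstance` is a hypothetical helper, the set of known instances is reduced to a map of names, and failing when an explicitly named instance is missing is an assumption:

```go
package main

import "fmt"

// ResolveInstance picks the module instance name to use for the given kind
// ("auth", "storage", ...). An explicit name from the configuration wins;
// otherwise "default-<kind>" and then "default" are tried, in that order.
func ResolveInstance(instances map[string]bool, explicit, kind string) (string, error) {
	if explicit != "" {
		if !instances[explicit] {
			return "", fmt.Errorf("unknown module instance %q", explicit)
		}
		return explicit, nil
	}
	for _, name := range []string{"default-" + kind, "default"} {
		if instances[name] {
			return name, nil
		}
	}
	return "", fmt.Errorf("no %s instance configured and no default available", kind)
}
```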

SMTP pipeline

If the user didn't specify a custom pipeline (no pipeline step declarations in the server block) but did specify a hostname, we use a default pipeline that does the following:

Verifies that FQDN, IP and rDNS of the connected client match.
Applies DKIM verification for incoming messages.
Probably does something with SPF
Delivers messages with domain equal to our hostname to default-storage or default.
Adds DKIM signatures for outgoing messages.
Passes messages with recipients other than local ones to message queue (also defined as a module, default-queue or default).

Default modules

Unless overridden by the user, maddy adds a default module that implements authentication, email storage and a delivery target (perhaps go-sqlmail? :)). So you can then literally just specify a hostname and have it work.

emersion · 2019-03-16T20:25:32Z

Probably does something with SPF

SPF is gross (doesn't handle relaying). Maybe we should just drop it?

We could add DMARC checks though.

foxcpp · 2019-03-16T20:52:20Z

We need a collection of use-cases to check how well our design (and more importantly -- config structure) works for them.

@xeoncross, @sapiens-sapide, any thoughts?

foxcpp · 2019-03-16T20:52:47Z

Probably does something with SPF

SPF is gross (doesn't handle relaying). Maybe we should just drop it?

We could add DMARC checks though.

Sure.

foxcpp · 2019-03-17T14:26:54Z

It is relatively easy to use caddyfile lexer to parse config into tree structure: https://hastebin.com/ovozofiweh.go

So I guess our problem with configuration format is solved.

sapiens-sapide · 2019-03-17T21:28:28Z

Adds DKIM signatures for outgoing messages.

This implies setting a DNS record for DKIM. I'm not sure it's good to sign outgoing messages by default if the DKIM signature can't be verified by peers because the record is missing.
At the very least, a message should be printed to the user showing the TXT record that should be set in the DNS zone.

emersion · 2019-03-17T21:54:38Z

The user will need to set up a bunch of records anyway (MX, DMARC, MTA-STS, etc). I think @foxcpp's idea of an embeddable zone file could solve this issue.

xeoncross · 2019-03-18T18:09:50Z

The user will need to setup a bunch of records anyway (MX, DMARC, MTA-STS, etc)

Not if maddy runs a DNS server itself. I've been looking at adding a crippled DNS server to projects using different Go libraries and it seems pretty doable. Simply set the domain's DNS to point to the same box; then maddy only replies to requests for [domains here] and provides the needed TXT, MX, etc. records.

We need a collection of use-cases to check how well our design (and more importantly -- config structure) works for them.

I have little interest in a configuration file for the use-cases I mentioned above. I would like to use maddy programmatically, wiring in pipelines on a per-project basis. Then again, I see maddy not as a simple MTA/MDA/MSA, but as a powerful library to add a full SMTP/IMAP server to other projects.

The benefit is a single binary / process which also runs an HTTP server, slack bot, queue client, etc.

foxcpp · 2019-03-18T19:04:03Z

@xeoncross I guess "maddy as a library" is not going to be the top-priority use-case to support. First we want to make a "simple MTA/MDA/MSA", and only then a generic IMAP/SMTP framework.

emersion · 2019-03-18T19:42:53Z

If you want libraries, you can already use go-imap, go-smtp et al.

Maddy could become a DNS server, but just generating a zone file as @foxcpp suggested is probably better. We could always think again about it if there are issues with this approach.


foxcpp mentioned this issue Mar 11, 2019

go-imap-sql v1.0 status foxcpp/go-imap-sql#2

Open

11 tasks

foxcpp changed the title ~~Share storage backends between IMAP and SMTP~~ Modular design Mar 11, 2019

foxcpp mentioned this issue Mar 13, 2019

Domain design for backend storage? foxcpp/go-imap-sql#1

Closed

foxcpp added a commit to foxcpp/go-imap-sql that referenced this issue Mar 14, 2019

Implement modular maddy interfaces

3efac94

foxcpp/maddy#15

foxcpp mentioned this issue Mar 14, 2019

Implementation of modular design #17

Merged

emersion closed this as completed in #17 Mar 30, 2019

emersion pushed a commit that referenced this issue Mar 30, 2019

Replace caddyfile parser with custom one

1a738d1

Reasons are explained here: #15 (comment)

foxcpp added this to the 0.1 milestone May 27, 2019

foxcpp mentioned this issue Feb 10, 2020

Sharing groups of module instances (configuration blocks), getting rid of parser-level config macros #195

Closed

7 tasks
