Safe JavaScript Module Plan

kate_sills · March 21, 2019, 8:41pm

Edit as of March 21: our plan has changed somewhat from the original, and we’ve moved the document to a GitHub repo. Updated description below:

Mark S. Miller, Darya Melicher, JF Paradis and I have been writing up a plan for Safe JavaScript Modules (old link to google doc). It’s still a work in progress, but please feel free to comment or make suggestions. The basic idea is this:

1. Separate the modules into “pure” and “resource” modules. There’s a full definition in the doc, but basically, pure modules don’t have side effects, don’t encompass system resources or resource modules, and don’t contain or transitively reference any mutable state. Resource modules are everything else. fs in Node.js, for example, would be a resource module.
2. Use Realms and SES to load each resource module into its own compartment. Pure modules are much safer and can be loaded together in the root realm.
3. Write attenuating modules to limit the authority of the modules to only what is necessary. In the example that we use in the document, the package supports-color imports the built-in os module, but it doesn’t really need most of the features of os, like the ability to set the scheduling priority of any process (!). It only uses the os module to get the current platform and release. Therefore, the attenuating module will provide an os to supports-color that is limited to just giving the current platform and release and not all of the other abilities.
4. If necessary, use a package manifest similar to package.json to describe the relationships among modules (including the attenuating modules) in a declarative manner.
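To make the attenuation idea concrete, here is a minimal sketch of what an attenuating module for os might look like. The name attenuateOs is made up for illustration and is not taken from the doc; only os.platform() and os.release() are real Node.js calls.

```javascript
// Hypothetical attenuating module for `os`, as it might be handed to supports-color.
const os = require('os');

// attenuateOs is an illustrative name, not part of any real package.
// It exposes only the two functions the consumer actually needs.
const attenuateOs = (fullOs) =>
  Object.freeze({
    platform: () => fullOs.platform(),
    release: () => fullOs.release(),
  });

const limitedOs = attenuateOs(os);

console.log(limitedOs.platform() === os.platform()); // true
console.log(typeof limitedOs.setPriority); // "undefined"
```

The consumer can still ask for the platform and release, but it never receives a reference to setPriority or anything else on os.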

kate_sills · January 12, 2019, 3:15am

We’ve simplified our approach in a major way. Now, there are only manifests and authority-handling modules that can subdivide, attenuate, or virtualize an application’s authority. I’m rewriting the doc now in light of this, but I can say it’s much easier to understand.

dckc · March 5, 2019, 1:52am

Refactoring code to adhere to ocap discipline is quite straightforward. How about rather than introducing a new manifest format and training people to use it, we just train people to use ocap discipline?

For example, instead of letting a module use ambient authority (even attenuated) like this …

const fs = require('fs');
const attenuateFs = require('attenuate-fs');
 
const altFs = attenuateFs(fs);

… we just pass the attenuated fs explicitly as an argument to whatever functions need it:

const addTodoToFile = (fs, todo, priority='Medium') => {
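The snippet above is cut off in the thread. As a rough sketch of the pattern it points at, the following assumes a body that appends one line to a file; the body, the file name todos.txt, and the in-memory stand-in makeInMemoryFs are all assumptions for illustration, not the original code.

```javascript
// Hypothetical body: the original post is truncated, so this is an assumption.
const addTodoToFile = (fs, todo, priority = 'Medium') => {
  fs.appendFileSync('todos.txt', `${priority}: ${todo}\n`);
};

// An attenuated stand-in that exposes only appendFileSync, backed by memory,
// so the caller decides exactly what "fs" means for this function.
const makeInMemoryFs = () => {
  const files = new Map();
  const fs = Object.freeze({
    appendFileSync: (path, text) => {
      files.set(path, (files.get(path) || '') + text);
    },
  });
  return { fs, files };
};

const { fs: fakeFs, files } = makeInMemoryFs();
addTodoToFile(fakeFs, 'write the manifest docs', 'High');
addTodoToFile(fakeFs, 'review attenuate-fs');

console.log(files.get('todos.txt'));
```

Because fs arrives as an argument, the function has no way to reach the real file system unless the caller chooses to hand it over.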

kate_sills · March 5, 2019, 1:58am

I agree. That’s the approach we’re advocating for (the functional approach in the doc) but for legacy code where we can’t make someone rewrite it, the wiring is described in a manifest (the legacy approach). I think we’re trying to use the manifest as little as possible (if someone can write all of their dependencies in the way you describe, they won’t have one at all!). I wonder if there’s a way to do away with it altogether, even for legacy code.

saleh · March 5, 2019, 11:30pm

Interesting that you raise that point. I will try not to plug Dynamic Namespaces here, but if we leave it aside and just consider what I was trying to explore before that in my modules alpha experiment, I think we can avoid rewriting code altogether.

In modules alpha, I was trying to refine evaluations I had done previously for first-class math globals, which borrowed from the vast improvements of the various iterations on realms shims to mimic ECMAScript module behaviours (just because I wanted to run such code where modules can’t go yet?!).

Having the benefit of seeing your code and taking part in our weekly discussions, it felt very intuitive that the model used to encapsulate in all those cases was providing two types of abstraction behaviours:

It controls inputs without changing the consumer’s semantics.
It optionally captures outputs or side-effects or both.

So, arriving at this thought, a good question to ask here is: are we mutating modules, or merely mutating the input for a particular consumer, where this input happens to be an output of a module but is sometimes just a scoped thing? I think that irrespective of optimization, and more importantly irrespective of the semantics implied by the syntax, from a purely OCAP notion, the thing mutated is always an input to the particular consumer.

Does this thought help?

Example from meeting

// @type {(attenuatedID, pureNS, attenuator) => undefined}
import attenuate from 'ses-magic';
import fs from 'fs';
import manifest from 'app/manifest.json';

attenuate(
	'alt-fs', // attenuatedID
	fs, // pureNS
	() => ({readFile: fs.readFile}), // creates attenuatedNS
);

attenuate.fromManifest(manifest);

saleh · March 13, 2019, 5:08pm

In Tuesday’s meeting, as we contrasted variants of the code in @kate_sills’s example, I tried to contrast the more flat abstract idea of a module with “theoretically” declarative attenuations. In my infinite ability to articulate confusion further, I showed example code which I was hoping would help.

So to give some context, I am working with some guarded assumptions when I am trying to reason about SES in both CommonJS and ESM. I am carefully looking for ways to validate or drop those guarded assumptions when they merit it or become challenged in our discussions.

1. I am assuming program code is separate from authority-handling code (not necessarily right or wrong at this point but a useful division to start with).
2. I am assuming both attenuation and program code execute synchronously (i.e. they retain the same execution flow that would unfold in @dckc’s example).
3. I am assuming that location or resolution of modules will occur asynchronously prior to the execution of any attenuations (i.e. an attenuated module exports undefined for all names in ESM until synchronous execution takes place in point 2 and only the expected authorities are exposed; that would not be the case for CommonJS, which resolves and executes synchronously).

Based on the above, my theoretical examples tried to do away with the actual wiring and instead look at the information needed by a wiring-oriented ESM loader and how that would be expressed in a require function if CommonJS had a similar divide of asynchronous resolution before synchronous execution:

js example 3a

const fs = require('fs', ['readFile', 'readFileAsync']);

fs.readFileAsync ? fs.readFileAsync(...args)
  : fs.readFile ? fs.readFile(...args)
  : throw Error('Not authorized');

js example 3b

import ['readFile', 'readFileAsync'] as fs from 'fs'; 

// ie: import * 

fs.readFileAsync ? fs.readFileAsync(...args)
  : fs.readFile ? fs.readFile(...args)
  : throw Error('Not authorized');

js example 3c

import {readFile, readFileAsync} from 'fs';

readFileAsync ? readFileAsync(...args)
  : readFile ? readFile(...args)
  : throw Error('Not authorized');

If fs is the attenuated one, then the last example will require an explicit export const { readFile, readFileAsync } = {} which would make them exported const bindings with undefined values.

Edit: The attenuated ESM module would theoretically look like this:

import fs from 'fs';
import { createAttenuatedModule } from 'ses-module-magic';

export let readFile, readFileAsync;
export const { /* writeFile… */ } = {};

createAttenuatedModule(import.meta.url, fs, ['readFile', 'readFileAsync'])
  .then(fs => ({ readFile, readFileAsync } = {... fs}));

Zarutian · March 20, 2019, 5:44pm

Regarding idea nr 1: I have unofficially and colloquially used the terms modules for “pure” and powers for “resource”. Why powers? Well, because authority is a bit too long syllable-wise, and powers is closer to the Icelandic word völd (plural; singular: vald), which I hold is more descriptive.

However, non-power modules aren’t pure in the sense of pure functions, as they could upon instantiation make state-bearing objects internal to that module instance, which precludes sharing those module instances between mutually untrusting Realms.

Regarding idea 3, the issue I have with such a declarative dependency graph description and inter-module configuration is the namespace of module identifiers (the string that is usually passed into require() or placed just after the ‘from’ part of an import statement).

Too many times I have had the issue of indirectly requiring two different versions of a module in an application (Module A requires one version of Module C to function, while Module B requires a different version of Module C). I have usually dealt with such issues, where the modules follow CommonJS, by overriding the require function seen by the code of one of the modules.
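As a rough illustration of that workaround, the sketch below gives one module a require that redirects a chosen identifier to a different version. makeScopedRequire, the toy registry, and the module names are all hypothetical; a real setup would wrap Node's own require rather than a lookup table.

```javascript
// Sketch: give one module a `require` that redirects selected identifiers.
// makeScopedRequire and every name below are hypothetical.
const makeScopedRequire = (baseRequire, overrides) => (id) =>
  Object.prototype.hasOwnProperty.call(overrides, id)
    ? overrides[id]
    : baseRequire(id);

// Stand-ins for two versions of "module-c".
const moduleCv1 = { version: '1.0.0' };
const moduleCv2 = { version: '2.0.0' };

// A toy registry plays the role of the normal require.
const registry = { 'module-c': moduleCv2 };
const baseRequire = (id) => {
  if (!Object.prototype.hasOwnProperty.call(registry, id)) {
    throw new Error(`Cannot find module '${id}'`);
  }
  return registry[id];
};

// Module A needs v1, so it sees a require with an override; Module B sees the default.
const requireForA = makeScopedRequire(baseRequire, { 'module-c': moduleCv1 });
const requireForB = baseRequire;

console.log(requireForA('module-c').version); // "1.0.0"
console.log(requireForB('module-c').version); // "2.0.0"
```

The same string identifier resolves to different module instances depending on which require a module was given, which is exactly why the identifier namespace matters for a declarative manifest.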

Btw, these kinds of issues arise when the maintainers of such modules are not active in keeping them up to date with the newest versions of their dependencies, or when the functionality or behaviour of their dependencies has changed in such a way that it no longer supports features or methodology the dependent module requires to function.

Regarding idea 4, the “configurator”, I counter with the KeyKOS Factory pattern where each ‘top-level’ factory used by the application is free to ‘subcontract’ to other factories how certain aspects of the ‘top-level’ factory products are made.

danfinlay · March 21, 2019, 4:46pm

One issue that jumps out at me is that this seems to only allow attenuation down to the function-reference level, but does not support further parameterization (maybe this child module needs to only read this one file, on certain days of the week).

For very fine-grained attenuations like that, it’s hard for me to imagine something better than just passing a new, validated function to the module that needs attenuating.
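A minimal sketch of that kind of parameterized attenuation might look like the following. makeLimitedReadFile, the path, and the in-memory fakeFs are hypothetical names for illustration; readFileSync mirrors the real Node.js method name, and the clock is injected so the day check can be exercised.

```javascript
// makeLimitedReadFile is a hypothetical helper: it returns a function that can
// read exactly one path, and only on the allowed days of the week (0 = Sunday).
const makeLimitedReadFile = (fs, allowedPath, allowedDays, now = () => new Date()) =>
  (path) => {
    if (path !== allowedPath) throw new Error('Not authorized: path');
    if (!allowedDays.includes(now().getDay())) throw new Error('Not authorized: day');
    return fs.readFileSync(path, 'utf8');
  };

// In-memory stand-in for fs so the sketch runs without touching disk.
const fakeFs = { readFileSync: (path) => `contents of ${path}` };

// March 21, 2019 was a Thursday, so getDay() returns 4.
const thursday = () => new Date(2019, 2, 21);

// Allowed on Mondays (1) and Thursdays (4) only.
const readConfig = makeLimitedReadFile(fakeFs, '/app/config.json', [1, 4], thursday);

console.log(readConfig('/app/config.json')); // "contents of /app/config.json"
```

The child module receives readConfig and nothing else, so the path and day restrictions travel with the function rather than living in a manifest.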

kate_sills · March 22, 2019, 5:18pm

Is this in reply to Baldur’s concept?

danfinlay · March 22, 2019, 5:30pm

Oh, sorry, I didn’t use the reply feature correctly.

I was trying to address @saleh’s concept like const fs = require('fs', ['readFile', 'readFileAsync']);.