Emacs Plugins in OCaml: Ecaml API Overview (part 3)

This time, we’ll take a look at the Ecaml modules used to interact with different components of Emacs. These modules define type-safe interfaces that represent buffers, font faces, and various Elisp data structures like hash tables and vectors.

This post reflects the state of the Ecaml library as of version v0.9.115.24+69 (git commit 25a9f825). It is, of course, being actively developed and improved upon, and some details may change between when this post was written and now, when you are reading it.

This post is part 3 of a series (prev/next). The full code is available on GitHub.

1 Intro

This post won’t cover everything in the Ecaml library, since it’s quite big1, but it should help give you a general idea for how it’s organized. Most of the individual functions correspond to one or two Elisp functions for which more information can be found by reading the corresponding section of the Emacs Lisp manual.

For example, Merlin reports the following type and documentation comment for Ecaml.Buffer.find, which names the corresponding Elisp function:

(* [find ~name] returns the live buffer whose name is [name], if any.
   [(describe-function 'get-buffer)]. *)
val find : name:string -> Buffer.t option

Evaluating (describe-function 'get-buffer), or invoking describe-function interactively (C-h f, or SPC h d f in Spacemacs), brings you to the corresponding documentation:

get-buffer is a built-in function in ‘C source code’.

(get-buffer BUFFER-OR-NAME)

Return the buffer named BUFFER-OR-NAME. BUFFER-OR-NAME must be either a string or a buffer. If BUFFER-OR-NAME is a string and there is no buffer with that name, return nil. If BUFFER-OR-NAME is a buffer, return it as given.

The correspondence between the behavior of Ecaml.Buffer.find and get-buffer should be apparent: find returns Some buffer if there exists a buffer with the given name, otherwise None.
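To make the nil-versus-option mapping concrete, here is a minimal sketch that runs without Emacs. The list live_buffers and the function describe are stand-ins I made up for illustration, and this find only imitates the shape of Ecaml.Buffer.find (it returns a name rather than a Buffer.t).

```ocaml
(* Stand-in for Emacs's set of live buffers; not part of Ecaml. *)
let live_buffers = [ "*scratch*"; "*Messages*" ]

(* Imitates the shape of [find]: [Some name] when a buffer with that name is
   live, [None] in the case where get-buffer would return nil. *)
let find ~name = List.find_opt (String.equal name) live_buffers

(* The caller is forced by the type to handle the "no such buffer" case. *)
let describe name =
  match find ~name with
  | Some buffer -> Printf.sprintf "found %s" buffer
  | None -> Printf.sprintf "no buffer named %s" name

let () = print_endline (describe "*scratch*")
let () = print_endline (describe "nope")
```

The point is only that the option type turns a nil check you might forget in Elisp into a case the compiler asks you to handle.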

2 The Ecaml module

The top-level module of the Ecaml library is named, unsurprisingly, Ecaml. It contains all of the other modules for interacting with more specific areas of Emacs functionality, as well as some commonly used functions directly in the Ecaml module:

module Ecaml : sig
  module Advice = Ecaml__.Advice
  (* snip *)
  module Working_directory = Ecaml__.Working_directory
  val defadvice :
    ?docstring:string ->
    ?position:Advice.Position.t ->
    Lexing.position ->
    advice_name:Advice.Name.t ->
    for_function:Symbol.t ->
    (args:Value.t list -> inner:(Value.t list -> Value.t) -> Value.t) -> unit
  val defun : (Symbol.t -> Function.Fn.t -> unit) Function.with_spec
  val defcustom :
    Lexing.position ->
    Symbol.t ->
    Ecaml.Customization.Type.t ->
    docstring:string ->
    group:Ecaml.Customization.Group.t -> standard_value:Value.t -> unit
  val defvar :
    Lexing.position -> Symbol.t -> Value.t -> docstring:string -> unit
  val define_derived_mode :
    ?parent:Ecaml.Major_mode.t ->
    Lexing.position ->
    change_command:Symbol.t ->
    docstring:string ->
    initialize:(unit -> unit) -> mode_line:string -> Ecaml.Major_mode.t
  val inhibit_messages : (unit -> 'a) -> 'a
  val message : string -> unit
  val message_s : Core_kernel.Sexp.t -> unit
  val messagef : ('a, unit, string, unit) Core_kernel.format4 -> 'a
  val provide : Symbol.t -> unit
  val inhibit_read_only : (unit -> 'a) -> 'a
end

Let’s go through a couple of the most important functions directly under Ecaml:

2.1 defun

defun allows you to define a named function, visible to any Elisp code, but whose body consists of an implementation in OCaml.

Its full signature, with the Function.with_spec type alias spelled out, is:

val defun
  :  ?docstring:string
  -> ?interactive:string
  -> ?optional_args:Symbol.t list
  -> ?rest_arg:Symbol.t
  -> Lexing.position -> args:Symbol.t list -> Symbol.t
  -> Function.Fn.t -> unit

That’s a lot to unpack, so we’ll take the arguments one at a time:

docstring
a string that is treated as a documentation comment (hence the name) for the function you’re defining. In Lisps, this is a string literal that comes directly after the argument list in a function definition, and basically describes what the function does. It’s a string instead of a comment because Lisp systems usually allow you to retrieve these strings at runtime. Docstrings appear in the text of describe-function when you do C-h f in Emacs, so they’re very useful for functions that are intended to be invoked by a user and not a program.
interactive
interactive is a special form in Elisp that turns a function into a command. It’s really complicated so I shan’t describe it here, but it basically does some magic so that you can prompt the user for input and your function receives that input as arguments.
optional_args and rest_arg
optional_args names the function’s optional parameters, and rest_arg names the parameter that will slurp up the rest of the arguments passed to the function, when applicable. The names themselves don’t change the behavior of the function, but they will appear in the documentation for describe-function, so you should still pick good names.
Lexing.position

defun requires a Lexing.position argument, which describes where in your OCaml code the function was defined. This shows up in the describe-function output so that you can easily figure out where the OCaml implementation lives.

ppx_here, a syntax extension, provides a handy shortcut for specifying the current source location: simply write [%here] where you need a Lexing.position.2, 3

args
like optional_args and rest_arg, just provides names for the formal parameters for your function. See above.
Symbol.t
the name of your function. This is what it’ll be referred to as by Emacs, so if you pass the symbol foo, any Elisp code that contains, say, (foo), will call your function.
Function.Fn.t
the body of your function. It’s the OCaml code that does the heavy lifting. It must be a value of type Ecaml.Value.t array -> Ecaml.Value.t, i.e., it should accept some number of arguments and return a valid Elisp value. Unfortunately, there’s no way for the type-checker to guarantee that you’ll get the right number of arguments (this is basically a Lisp function, after all), so you will have to handle any arity errors.

Here’s an example of a simple function being defined using defun.

open Ecaml

let () =
  (* Define the Elisp function say-hello, which takes one optional argument. *)
  defun [%here] (Symbol.intern "say-hello")
    ~optional_args:[ Symbol.intern "name" ]
    ~args:[]
    (function
      | [| name |] ->
        (* An omitted optional argument arrives as nil. *)
        let name =
          if Value.is_nil name
          then "World"
          else Value.to_utf8_bytes_exn name
        in
        Value.of_utf8_bytes ("Hello, " ^ name ^ "!")
      | _ -> invalid_arg "wrong arity")
;;

let () = provide (Symbol.intern "ecaml-bf")

Elisp does not treat an omitted optional argument as a shorter argument list, as you might guess. Instead, our function always receives a name argument, which is nil if the caller didn’t provide one. We check for nil to decide whether to use a default value, and then return an appropriate greeting.

We can test it like so:

alias eb="emacs -Q -L _build/default/src --batch"
eb --eval "(require 'ecaml-bf)" --eval '(print (say-hello))'
# "Loaded Ecaml."
#
# "Hello, World!"
eb --eval "(require 'ecaml-bf)" --eval '(print (say-hello "Emacs"))'
# "Loaded Ecaml."
#
# "Hello, Emacs!"

If you’re familiar with Elisp, it might be helpful to see how corresponding Ecaml and Elisp code compare, so here’s the above example written in Elisp.

(defun say-hello (&optional name)
  (let ((name (or name "World")))
    (concat "Hello, " name "!")))

Okay, so it’s a lot shorter. But that’s okay, because most of the code we write won’t be just tons of boilerplate wrapping trivial OCaml functions. Instead, the point of Ecaml is to let OCaml do the heavy lifting, so we only need to define the interface for our plugin using defun and then we can hack on whatever interesting functionality we want to provide.
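One way to keep that boilerplate thin is to put the real logic in a plain OCaml function and leave only the argument unpacking in the wrapper. Here is a rough sketch that runs without Emacs: the value type is a stand-in I made up for Ecaml.Value.t, and greeting is a hypothetical helper, so treat this as an illustration of the split rather than as Ecaml code.

```ocaml
(* Stand-in for Ecaml.Value.t, only so this sketch runs without Emacs. *)
type value =
  | Nil
  | Str of string

(* The interesting logic: plain OCaml, no Emacs types involved. *)
let greeting ?(name = "World") () = "Hello, " ^ name ^ "!"

(* The thin wrapper: it has the same shape as the function passed to defun
   (an array of values in, one value out) and handles arity itself. *)
let say_hello (args : value array) : value =
  match args with
  | [| Nil |] -> Str (greeting ())
  | [| Str name |] -> Str (greeting ~name ())
  | _ -> invalid_arg "say-hello: wrong arity"

let () =
  match say_hello [| Str "Emacs" |] with
  | Str s -> print_endline s
  | Nil -> print_endline "nil"
```

With this split, greeting can be unit-tested like any other OCaml function, and only the wrapper needs Emacs to exercise it.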

2.2 defvar and defcustom

These functions allow you to define Elisp variables4, similarly to what defun does for functions. The difference between them is that defcustom defines a customizable variable. The Elisp manual, section 14.3:

“Customizable variables”, also called “user options”, are global Lisp variables whose values can be set through the Customize interface. Unlike other global variables, which are defined with ‘defvar’ (*note Defining Variables::), customizable variables are defined using the ‘defcustom’ macro. In addition to calling ‘defvar’ as a subroutine, ‘defcustom’ states how the variable should be displayed in the Customize interface, the values it is allowed to take, etc.

So, what does that mean for your plugin? Well, basically it just means that the variable can be set and queried interactively by the user using the Emacs Customize interface.

2.3 message and co.

val message : string -> unit
val message_s : Core_kernel.Sexp.t -> unit
val messagef : ('a, unit, string, unit) Core_kernel.format4 -> 'a

These three functions are simply various ways of printing messages to the echo area. The echo area is the tiny area at the very bottom of the Emacs window, where messages will appear, such as “(No changes need to be saved)” when you try to save a file you have just saved.

The Elisp function message displays a new message in the echo area. Messages are also appended to the special *Messages* buffer, so you can view that buffer to see any messages you may have accidentally dismissed.

message and message_s accept a string and a Sexp.t5, respectively. (Sexp.t is from Jane Street’s sexplib.)

messagef also prints to the echo area but it accepts a formatting string and arguments the same way printf does, e.g.,

let () = messagef "the answer is %d" 42
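If the format4 type in messagef’s signature looks mysterious, it may help to see how a function of that type can be built from one that takes a string. The sketch below uses Printf.ksprintf from the standard library; the message here is a stand-in that records strings in a list instead of talking to Emacs, and I’m not claiming this is how Ecaml implements messagef.

```ocaml
(* Stand-in for Ecaml's [message]: records messages in a list (newest first)
   instead of showing them in the echo area. *)
let log : string list ref = ref []
let message (s : string) : unit = log := s :: !log

(* [Printf.ksprintf k fmt] formats like sprintf, then passes the resulting
   string to [k]. With [k = message] the result has the type
   ('a, unit, string, unit) format4 -> 'a, the same as messagef's. *)
let messagef fmt = Printf.ksprintf message fmt

let () = messagef "the answer is %d" 42
let () = messagef "%s has %d buffers" "Emacs" 3
let () = List.iter print_endline (List.rev !log)
```

The format string is checked at compile time, so passing a string where %d expects an int is a type error rather than a runtime surprise.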

2.4 provide

provide registers a feature with Emacs, basically telling Emacs that the plugin was successfully loaded. This is needed in order to load the plugin using require (or Emacs will complain). It also prevents Emacs from trying to load the same plugin again, since require checks to see if its argument has been registered as a feature before loading a plugin. Call provide when your plugin is finished setting up.
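The interplay between provide and require can be modeled in a few lines of plain OCaml. This is only a sketch of the idea: the features table, load_count, and load_plugin are made up for illustration, feature names are strings rather than symbols, and the real require also has to find the file to load.

```ocaml
(* Stand-in for Emacs's set of loaded features. *)
let features : (string, unit) Hashtbl.t = Hashtbl.create 16

(* Counts how often the plugin body runs, to show that loading happens once. *)
let load_count = ref 0

let provide feature = Hashtbl.replace features feature ()

(* [require feature ~load] runs [load] only if [feature] is not registered yet,
   and complains if [load] finished without providing it. *)
let require feature ~load =
  if not (Hashtbl.mem features feature) then begin
    load ();
    if not (Hashtbl.mem features feature)
    then failwith ("required feature was not provided: " ^ feature)
  end

(* A pretend plugin: does its setup, then calls provide as its last step. *)
let load_plugin () =
  incr load_count;
  provide "ecaml-bf"

let () =
  require "ecaml-bf" ~load:load_plugin;
  (* Already provided, so the plugin is not loaded a second time. *)
  require "ecaml-bf" ~load:load_plugin;
  Printf.printf "loaded %d time(s)\n" !load_count
```

Calling provide last means a plugin that fails halfway through setup is not recorded as loaded.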

3 Feature rundown

What follows is a listing of the current modules in Ecaml (as of version v0.9.115.24+69), each with a brief summary of the functionality provided within.

Many of the modules correspond one-to-one with Emacs concepts that the manual documents better than I can here. Module names are linked to the appropriate section of the Emacs or Emacs Lisp manuals.

Advice
Elisp has a system for “advising” functions: you add to or modify the behavior of a function without completely redefining it, by writing a new function that is called before, after, or in place of the old one, among other options. Ecaml currently only supports around advice.
Ansi_color
Contains functions for interpreting ANSI color escape sequences in Elisp strings and translating them into Elisp string properties which encode equivalent colors. Such strings might often be the result of running terminal-oriented version control or diff programs.6
Auto_mode_alist
Manages the variable auto-mode-alist, which determines how Emacs decides what major mode to open a file in, based on the filename.
Backup
Manages the variable make-backup-files, which controls whether Emacs will make backup files (like foo.c~) when you edit files.
Buffer
Everything to do with buffers, including killing buffers, displaying them, and finding out what files they’re visiting.
Char_code
Internally, Emacs represents characters as just a code point (integer). Technically, these are a superset of Unicode code points.
Color
Deals with colors and Color Names, which are various forms of plain English and RGB strings that Emacs can interpret as colors and display. See M-x list-colors-display for a list of available colors.
Command
A command is simply a function that can be called interactively, e.g., through M-x. It should specify a way of receiving its arguments interactively, such as by prompting the user for input, rather than through the normal function call mechanism.
Comment
Manages the active comment syntax.
Compilation
Not sure what this does.
Current_buffer
Emacs has a notion of the current buffer, which many Elisp functions operate on by default instead of accepting a buffer or buffer name as an argument. This module allows you to set the current buffer and use those functions.
Customization
Manages the definition and organization of customization items, which allow users to customize variables and (font) faces through an organized, interactive user interface.
Directory
Functions for managing file directories.
Echo_area
message and co. print their messages here, at the bottom of Emacs’s frame. Also allows you to temporarily inhibit messages in the echo area (but they’re still logged to *Messages*).
Face
Emacs’s notion of fonts, including font families, sizes, weights, styles, and decorations.7
Feature
Emacs records a set of named features provided by packages (they’re just symbols). Code that depends on a given feature can require it, which causes Emacs to load the package that provides it—unless it has already been loaded.
File
Functions for managing files (renaming, writing, permissions, etc.).
Filename
Functions for managing filenames (extensions, relative/absolute paths, etc.)
Find_function
find-function jumps to the source code where a function is defined, given its name. It’s not built-in as part of Emacs, but rather is defined by the Find Func package.
Form
Lisp treats all code as plain old data (cons cells and symbols and numbers—oh my!). Module Form contains functions specifically related to treating Lisp values as code (e.g., eval and quote).
Frame
Manages Emacs frames, which you probably call “windows” if you run GUI Emacs (as opposed to in the terminal). For Emacs’s windows, see below.
Function
Manages Elisp functions. Function.Fn.t is the common signature for all OCaml functions that are to be called from within Emacs.
Grep
Facility for running grep from within Emacs. grep is run asynchronously, and the results are collected in the *grep* buffer.
Hash_table
Elisp’s built-in hash table data structure.
Hook
Hooks identify logical places where you can register functions to be called. For example, you could attach a function to before-save-hook to remove trailing whitespace from a file you’re editing before you save it.
Input_event
Deals with user input events, such as mouse clicks and key presses. You could use Input_event.modifiers to determine whether the CTRL key was held down while a key was pressed.
Key_sequence
A sequence of one or more key presses that form a unit, such as C-x C-f. This module can be used to read key sequences from user input or simulate key presses using execute-kbd-macro.
Keymap
Keymaps relate input events to commands or other keymaps (allowing multiple input events to correspond to a single command). This is how keys are bound to commands, and we can use these to provide key bindings for any commands we provide in our plugin.
Load
Loads another plugin, usually a Lisp file. Also contains path, which lets you examine the load path: the list of directories where Emacs will look for the file name you pass to load.
Load_history

Best described by an excerpt from the doc comments:

(* [update_emacs_with_entries] updates [load-history] with the information supplied to
   [add_entry], which make it possible to, within Emacs, jump from a symbol defined by
   Ecaml to the Ecaml source. *)

Used behind-the-scenes by defcustom, defun, defvar, and so on.

Major_mode
Manages major modes, major mode keymaps, derived modes, etc.
Marker
Markers keep track of a certain position in a buffer. They are automatically adjusted when the buffer is edited, so that they maintain the same logical position if not the same offset.
Minibuffer
Often confused with the Echo Area (I used this term incorrectly in the first post of this series). Used for reading input from the user. For example, when you run an interactive command through M-x, the minibuffer is where you type the name of the command you want to run.
Minor_mode
Secondary modes you can enable and disable that provide additional functionality on top of the major mode.
Obarray
An internal Emacs data structure that stores a set of symbols, for use with intern and read. Normally there is only one, stored in the variable obarray, so there is only one symbol with a given name.
Point
The location of the cursor in a buffer. Module Point contains functions for setting the point and searching forward for a given string, for example.
Position
A number that describes the position of a character or cursor within a buffer. Starts at 1. Doesn’t automatically move when the buffer is edited, unlike a marker.
Process
Manages child processes of Emacs, such as shells, ispell, or Merlin. This also includes the Emacs server.
Q
Contains a large number of symbols, including keyword symbols (module Q.K) and ampersand-symbols like &optional (module Q.A). These are all stored here to avoid unnecessarily allocating OCaml data structures to refer to these symbols whenever they’re needed.
Regexp
Elisp regular expressions. Their syntax differs slightly from that of Perl or Perl-compatible regular expressions.8
Selected_window
Similar to Current_buffer, but for Emacs’s “windows”.
Symbol

Identifiers in Elisp are converted to symbols, which are “interned”. This means that two symbols with the same name9 are physically the same object, and so can be compared by pointer equality rather than by comparing their names. Module Symbol also contains functions for calling Elisp functions from OCaml, with convenience functions for different arities.

For example, Symbol.funcall1 accepts a symbol, which should denote the name of a function, and a single argument, which is passed to the function. Functions in this module whose names end in _i ignore the return value and return unit instead of Value.t. These are useful since a lot of Elisp functions with side effects don’t return anything useful.

Most of the other modules’ functionality is built on top of Symbol and its function-calling functions.

Syntax_table
Syntax tables are used to provide language-specific functionality, e.g., syntax highlighting.
System
Interacts with the operating system. Currently contains functions for setting and querying environment variables.
Text
Represents Emacs’s strings, which contain not only characters but also text properties, which enhance text to provide everything from (font) faces to read-only status to text-based “buttons”. See the manual for more on text properties.
Timer
Timers allow you to schedule a function to be run at a future time, possibly more than once.
User
Manages operating system users, such as their login names and UIDs.
Value
Lisps are dynamically typed, and every type of value is ultimately a subtype of Value.t. Contains a lot of type predicates that allow you to test the type of an arbitrary value.
Var
Special features available to variables, such as setting a buffer-local value or a default value.
Vector
Elisp’s built-in vector data structure.
Window
Confusingly, Emacs uses the term window to refer to what most people might call window panes. A window displays a buffer.
Working_directory
Each buffer, as well as Emacs itself, has a working directory. Relative paths are resolved relative to this directory. Working_directory.within runs a function with the current working directory set to a given value.
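The “run a function with some setting temporarily changed” pattern behind Working_directory.within is easy to sketch in plain OCaml. The within below is a hypothetical helper of my own, not Ecaml’s implementation: it changes the working directory of the OCaml process with Sys.chdir, whereas Emacs tracks a working directory per buffer.

```ocaml
(* Hypothetical helper, not Ecaml's implementation: run [f] with the process
   working directory set to [dir], restoring the previous one afterwards. *)
let within ~dir f =
  let previous = Sys.getcwd () in
  Sys.chdir dir;
  (* Fun.protect runs [finally] whether [f] returns normally or raises. *)
  Fun.protect ~finally:(fun () -> Sys.chdir previous) f

let () =
  let before = Sys.getcwd () in
  let inside = within ~dir:(Filename.get_temp_dir_name ()) Sys.getcwd in
  Printf.printf "inside: %s\nafter: %s\n" inside (Sys.getcwd ());
  (* The original directory is back in effect once [within] returns. *)
  assert (Sys.getcwd () = before)
```

Taking a function as the argument, rather than offering separate “set” and “restore” calls, means the caller can’t forget to restore the old directory, even when the function raises.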

4 Phew!

That was quite a few modules. We’ll use some of them next week in building our interpreter plugin. Catch you next time.

Footnotes:

1

Plus, I have no idea what many of the individual pieces do.

2

ppx_jane, Jane Street’s set of ppx rewriters, includes ppx_here, so if you already have that installed, you can use it to provide [%here]. You’ll need to add the following to your jbuild file:

(jbuild_version 1)

(executables
 ((names     (main))
  (libraries (ecaml))
  (preprocess (pps (ppx_jane))))) ; add this line

3

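If you’re not keen on using syntax extensions, you can use OCaml’s built-in __POS__ value. Unfortunately, it evaluates to a tuple (file name, line number, start column, end column) instead of a record. The tuple has the same number and types of fields as Lexing.position, so Obj.magic __POS__ will get you a Lexing.position with the right file name and line number, but the last two fields don’t mean the same thing as pos_bol and pos_cnum, so the column information will be off. But don’t tell anyone I told you that!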
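A safer alternative to casting is to build the record field by field. Here’s a minimal sketch using only the standard library; position_of_pos is a name I made up, and since __POS__ doesn’t give absolute file offsets, I’m assuming it’s acceptable to set pos_bol to 0 and pos_cnum to the start column, which at least keeps pos_cnum - pos_bol equal to the column.

```ocaml
(* Build a Lexing.position from the tuple that __POS__ evaluates to.
   __POS__ gives (file, line, start column, end column), while the record
   wants absolute offsets for pos_bol and pos_cnum. Those are not available
   here, so this uses 0 for pos_bol and the start column for pos_cnum
   (an assumption of this sketch). *)
let position_of_pos ((file, line, column, _end_column) : string * int * int * int)
  : Lexing.position =
  { Lexing.pos_fname = file; pos_lnum = line; pos_bol = 0; pos_cnum = column }

let here = position_of_pos __POS__

let () =
  Printf.printf "%s:%d:%d\n" here.Lexing.pos_fname here.Lexing.pos_lnum
    (here.Lexing.pos_cnum - here.Lexing.pos_bol)
```

You would then pass here (or position_of_pos __POS__ inline) wherever a Lexing.position is expected.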

4

Elisp is a Lisp-2, meaning that functions and variables live in two different and non-overlapping namespaces. That’s why defining functions and variables uses different mechanisms (unlike in, say, OCaml!).

5

See? Interfacing OCaml with Lisp isn’t that weird after all!

6

In fact, this was one of the key original motivations for Ecaml.

7

Surprisingly, faces in Elisp live in a completely separate namespace from variables and functions. So perhaps Elisp is a Lisp-3?

8

For example, the grouping operator is \( ... \), not ( ... ). Plain old parentheses just match literal parentheses in the target string.

9

in the same obarray.
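To see what “in the same obarray” buys you, here is a toy model of interning in plain OCaml. The symbol and obarray types are stand-ins I made up, not Ecaml’s or Emacs’s actual representations.

```ocaml
(* A symbol is a mutable record so that physical equality (==) is meaningful. *)
type symbol = { name : string; mutable value : string option }

(* Stand-in for an obarray: a table from names to the unique symbol object. *)
type obarray = (string, symbol) Hashtbl.t

(* Return the existing symbol with this name, or create and register one. *)
let intern (ob : obarray) name =
  match Hashtbl.find_opt ob name with
  | Some sym -> sym
  | None ->
    let sym = { name; value = None } in
    Hashtbl.add ob name sym;
    sym

let ob1 : obarray = Hashtbl.create 16
let ob2 : obarray = Hashtbl.create 16

(* Same name, same obarray: physically the same object. *)
let same_obarray = intern ob1 "foo" == intern ob1 "foo"

(* Same name, different obarrays: two distinct objects. *)
let other_obarray = intern ob1 "foo" == intern ob2 "foo"

let () = Printf.printf "%b %b\n" same_obarray other_obarray
```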
