Emacs Plugins in OCaml: Ecaml API Overview (part 3)
Tags: ecaml emacs ocaml · Category: ecaml-getting-started
This post is part 3 of a series. The full code is available on GitHub.
This time, we’ll take a look at the Ecaml modules used to interact with different components of Emacs. These modules define type-safe interfaces that represent buffers, font faces, and various Elisp data structures like hash tables and vectors.
This post reflects the state of the Ecaml library as of version v0.9.115.24+69 (git commit 25a9f825). It is, of course, being actively developed and improved upon, and some details may change between when this post was written and now, when you are reading it.
1 Intro
This post won’t cover everything in the Ecaml library, since it’s quite big[1], but it should help give you a general idea of how it’s organized. Most of the individual functions correspond to one or two Elisp functions, for which more information can be found by reading the corresponding section of the Emacs Lisp manual.
For example, Merlin reports the following type and documentation comment for Ecaml.Buffer.find, which links to the corresponding Emacs Manual section:

    (* [find ~name] returns the live buffer whose name is [name], if any.
       [(describe-function 'get-buffer)]. *)
    val find : name:string -> Buffer.t option

Evaluating (describe-function 'get-buffer), either manually or by invoking describe-function interactively (C-h f, or SPC h d f for Spacemacs), brings you to the corresponding documentation:

    get-buffer is a built-in function in ‘C source code’.

    (get-buffer BUFFER-OR-NAME)

    Return the buffer named BUFFER-OR-NAME.
    BUFFER-OR-NAME must be either a string or a buffer. If BUFFER-OR-NAME
    is a string and there is no buffer with that name, return nil. If
    BUFFER-OR-NAME is a buffer, return it as given.

The correspondence between the behavior of Ecaml.Buffer.find and get-buffer should be apparent: find returns Some buffer if there exists a buffer with the given name, otherwise None.
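As a rough illustration of what that option type buys you on the OCaml side, here is a small sketch of some calling code. The Buffer module below is a stand-in written for this example (a hard-coded list of buffer names), not Ecaml's real implementation; only the signature of find is taken from the Merlin output above, and describe_buffer is a made-up helper.

```ocaml
(* Stand-in for Ecaml.Buffer so that this sketch runs without Emacs.
   Only the signature of [find] mirrors the real library. *)
module Buffer = struct
  type t = { name : string }

  (* Pretend these are the live buffers. *)
  let live = [ { name = "*scratch*" }; { name = "*Messages*" } ]

  let find ~name : t option = List.find_opt (fun b -> b.name = name) live
end

(* Hypothetical helper: the compiler makes us handle the [None] case,
   where Elisp code would have to remember to check for nil. *)
let describe_buffer name =
  match Buffer.find ~name with
  | Some _ -> Printf.sprintf "%s is live" name
  | None -> Printf.sprintf "no buffer named %s" name

let () =
  print_endline (describe_buffer "*scratch*");
  print_endline (describe_buffer "no-such-buffer")
```

With the real library, the match expression would look the same; only the stand-in module goes away.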
2 The Ecaml module
The top-level module of the Ecaml library is named, unsurprisingly, Ecaml. It contains all of the other modules for interacting with more specific areas of Emacs functionality, as well as some commonly used functions directly in the Ecaml module:

    module Ecaml : sig
      module Advice = Ecaml__.Advice
      (* snip *)
      module Working_directory = Ecaml__.Working_directory

      val defadvice :
        ?docstring:string ->
        ?position:Advice.Position.t ->
        Lexing.position ->
        advice_name:Advice.Name.t ->
        for_function:Symbol.t ->
        (args:Value.t list -> inner:(Value.t list -> Value.t) -> Value.t) ->
        unit
      val defun : (Symbol.t -> Function.Fn.t -> unit) Function.with_spec
      val defcustom :
        Lexing.position ->
        Symbol.t ->
        Ecaml.Customization.Type.t ->
        docstring:string ->
        group:Ecaml.Customization.Group.t ->
        standard_value:Value.t ->
        unit
      val defvar : Lexing.position -> Symbol.t -> Value.t -> docstring:string -> unit
      val define_derived_mode :
        ?parent:Ecaml.Major_mode.t ->
        Lexing.position ->
        change_command:Symbol.t ->
        docstring:string ->
        initialize:(unit -> unit) ->
        mode_line:string ->
        Ecaml.Major_mode.t
      val inhibit_messages : (unit -> 'a) -> 'a
      val message : string -> unit
      val message_s : Core_kernel.Sexp.t -> unit
      val messagef : ('a, unit, string, unit) Core_kernel.format4 -> 'a
      val provide : Symbol.t -> unit
      val inhibit_read_only : (unit -> 'a) -> 'a
    end

Let’s go through a couple of the most important functions directly under Ecaml:
2.1 defun
defun allows you to define a named function, visible to any Elisp code, but whose body consists of an implementation in OCaml. Its full signature, with the Function.with_spec type alias spelled out, is:

    val defun :
      ?docstring:string ->
      ?interactive:string ->
      ?optional_args:Symbol.t list ->
      ?rest_arg:Symbol.t ->
      Lexing.position ->
      args:Symbol.t list ->
      Symbol.t ->
      Function.Fn.t ->
      unit

That’s a lot to unpack, so we’ll take the arguments one at a time:
docstring
- a string that is treated as a documentation comment (hence the name) for the function you’re defining. In Lisps, this is a string literal that comes directly after the argument list in a function definition, and basically describes what the function does. It’s a string instead of a comment because Lisp systems usually allow you to retrieve these strings at runtime. Docstrings appear in the text of describe-function when you do C-h f in Emacs, so they’re very useful for functions that are intended to be invoked by a user and not a program.
interactive
- interactive is a special form in Elisp that turns a function into a command. It’s really complicated so I shan’t describe it here, but it basically does some magic so that you can prompt the user for input and your function receives that input as arguments.
optional_args and rest_arg
- names for the optional arguments and for the argument that will slurp up the rest of the arguments passed to the function, when applicable. The names themselves don’t change the behavior of the function, but they will appear in the documentation for describe-function, so you should still pick good names.
Lexing.position
- defun requires a Lexing.position argument, which describes where in your OCaml code the function was defined. This shows up in the describe-function output so that you can easily figure out where the OCaml implementation lives. ppx_here, a syntax extension, provides a handy shortcut for specifying the current source location: simply write [%here] where you need a Lexing.position.[2][3]
args
- like optional_args and rest_arg, just provides names for the formal parameters of your function. See above.
Symbol.t
- the name of your function. This is what it’ll be referred to as by Emacs, so if you pass the symbol foo, any Elisp code that contains, say, (foo), will call your function.
Function.Fn.t
- the body of your function. It’s the OCaml code that does the heavy lifting. It must be a value of type Ecaml.Value.t array -> Ecaml.Value.t, i.e., it should accept some number of arguments and return a valid Elisp value. Unfortunately, there’s no way for the type-checker to guarantee that you’ll get the right number of arguments (this is basically a Lisp function, after all), so you will have to handle any arity errors.
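If you would rather not pull in a ppx just for [%here], a Lexing.position can also be built by hand from OCaml's built-in __POS__, which evaluates to a (file, line, start column, end column) tuple. The sketch below uses only the standard library; position_of_pos is a name made up for this example, and setting pos_bol to 0 is a simplification chosen so that pos_cnum - pos_bol gives the column.

```ocaml
(* Hypothetical helper: convert the tuple produced by [__POS__] into a
   [Lexing.position] record. [pos_bol] is set to 0 so that
   [pos_cnum - pos_bol] is the column within the line. *)
let position_of_pos (file, line, start_col, _end_col) : Lexing.position =
  { Lexing.pos_fname = file; pos_lnum = line; pos_bol = 0; pos_cnum = start_col }

(* The position of this very expression in the source file. *)
let here = position_of_pos __POS__

let () =
  Printf.printf "%s:%d:%d\n" here.Lexing.pos_fname here.Lexing.pos_lnum
    (here.Lexing.pos_cnum - here.Lexing.pos_bol)
```

The resulting value can be passed wherever defun expects its Lexing.position argument.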
Here’s an example of a simple function being defined using defun.

    open Ecaml

    let () =
      defun [%here] (Symbol.intern "say-hello")
        ~optional_args:[ Symbol.intern "name" ]
        ~args:[]
        (function
          | [| name |] ->
            let name =
              if Value.is_nil name
              then "World"
              else Value.to_utf8_bytes_exn name
            in
            Value.of_utf8_bytes ("Hello, " ^ name ^ "!")
          | _ -> invalid_arg "wrong arity")
    ;;

    let () = provide (Symbol.intern "ecaml-bf")

The way Elisp treats optional arguments is not, as you might guess, that the function might receive zero or one argument, but rather that the name argument might be nil if no argument was provided. We check that to determine whether to use a default value, and then return an appropriate greeting.
We can test it like so:

    alias eb="emacs -Q -L _build/default/src --batch"

    eb --eval "(require 'ecaml-bf)" --eval '(print (say-hello))'
    # "Loaded Ecaml."
    #
    # "Hello, World!"

    eb --eval "(require 'ecaml-bf)" --eval '(print (say-hello "Emacs"))'
    # "Loaded Ecaml."
    #
    # "Hello, Emacs!"

If you’re familiar with Elisp, it might be helpful to see how corresponding Ecaml and Elisp code compare, so here’s the above example written in Elisp.

    (defun say-hello (&optional name)
      (let ((name (or name "World")))
        (concat "Hello, " name "!")))

Okay, so it’s a lot shorter. But that’s okay, because most of the code we write won’t be just tons of boilerplate wrapping trivial OCaml functions. Instead, the point of Ecaml is to let OCaml do the heavy lifting, so we only need to define the interface for our plugin using defun and then we can hack on whatever interesting functionality we want to provide.
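To make the optional-argument convention from the say-hello example concrete, here is a model of it that runs without Emacs. The value type and the call_with_optional helper are inventions for this sketch (Ecaml's Value.t refers to a real Elisp value, and Emacs takes care of the padding itself); only the shape of the function body, an array of values in and one value out, follows Function.Fn.t as described above.

```ocaml
(* Toy stand-in for Elisp values; not Ecaml's real [Value.t]. *)
type value =
  | Nil
  | Str of string

(* Same shape as the body passed to [defun] above: the optional [name]
   argument is always present in the array, but may be [Nil]. *)
let say_hello : value array -> value = function
  | [| name |] ->
    let name = match name with Nil -> "World" | Str s -> s in
    Str ("Hello, " ^ name ^ "!")
  | _ -> invalid_arg "wrong arity"

(* Hypothetical model of the caller's side: omitted optional arguments
   are filled in with [Nil] before the body runs. *)
let call_with_optional ~n_optional f (args : value list) =
  let missing = n_optional - List.length args in
  if missing < 0 then invalid_arg "too many arguments";
  f (Array.of_list (args @ List.init missing (fun _ -> Nil)))

let () =
  assert (call_with_optional ~n_optional:1 say_hello [] = Str "Hello, World!");
  assert (
    call_with_optional ~n_optional:1 say_hello [ Str "Emacs" ]
    = Str "Hello, Emacs!")
```

Either way the body sees a one-element array, which is why the pattern match only needs a single successful case plus the arity fallback.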
2.2 defvar and defcustom
These functions allow you to define Elisp variables[4], similarly to what defun does for functions. The difference between them is that defcustom defines a customizable variable. From the Elisp manual, section 14.3:
“Customizable variables”, also called “user options”, are global Lisp variables whose values can be set through the Customize interface. Unlike other global variables, which are defined with ‘defvar’ (*note Defining Variables::), customizable variables are defined using the ‘defcustom’ macro. In addition to calling ‘defvar’ as a subroutine, ‘defcustom’ states how the variable should be displayed in the Customize interface, the values it is allowed to take, etc.
So, what does that mean for your plugin? Well, basically it just means that the variable can be set and queried interactively by the user using the Emacs Customize interface.
2.3 message and co.

    val message : string -> unit
    val message_s : Core_kernel.Sexp.t -> unit
    val messagef : ('a, unit, string, unit) Core_kernel.format4 -> 'a

The next three functions are simply various ways of printing messages to the echo area. The echo area is the tiny area at the very bottom of the Emacs window where messages appear, such as “(No changes needed to be saved)” when you try to save a file you have just saved.
The Elisp function message displays a new message in the echo area. Messages are also appended to the special *Messages* buffer, so you can view that buffer to see any messages you may have accidentally dismissed.
message and message_s accept a string and a Sexp.t[5], respectively. (Sexp.t is from Jane Street’s sexplib.)
messagef also prints to the echo area, but it accepts a format string and arguments the same way printf does, e.g.,

    let () = messagef "the answer is %d" 42

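The type of messagef may look intimidating, but it is the usual shape of a printf-style function that hands its formatted string to another function. As a sketch of how such a function can be built with nothing but the standard library, the block below defines a stand-in message that appends to an in-memory log (the real one talks to Emacs) and derives a messagef from it with Printf.ksprintf. Whether Ecaml implements messagef this way is an assumption; the resulting type matches the signature shown above, with the standard library's format4 in place of Core_kernel's.

```ocaml
(* Stand-in for Ecaml's [message]: record messages in a list instead of
   showing them in the echo area. *)
let log : string list ref = ref []
let message (s : string) : unit = log := s :: !log

(* [Printf.ksprintf k fmt] formats its arguments into a string and then
   calls [k] on the result. *)
let messagef (fmt : ('a, unit, string, unit) format4) : 'a =
  Printf.ksprintf message fmt

let () =
  messagef "the answer is %d" 42;
  messagef "%s has %d buffers" "Emacs" 3;
  (* Print the logged messages, oldest first. *)
  List.iter print_endline (List.rev !log)
```

Because the format string is checked at compile time, passing a string where %d expects an int is a type error rather than a runtime surprise.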
2.4 provide
provide registers a feature with Emacs, basically telling Emacs that the plugin was successfully loaded. This is needed in order to load the plugin using require (or Emacs will complain). It also prevents Emacs from trying to load the same plugin again, since require checks to see if its argument has been registered as a feature before loading a plugin. Call provide when your plugin is finished setting up.
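The interplay between provide and require can be modelled in a few lines. Everything below is a toy written for this post (a mutable list standing in for Emacs's set of features, and a callback standing in for loading a file); it is not how Emacs or Ecaml implement it, but it shows why a plugin that forgets to call provide makes require complain, and why a second require does nothing.

```ocaml
(* Toy model: the set of features Emacs knows about. *)
let features : string list ref = ref []

let provide feature =
  if not (List.mem feature !features) then features := feature :: !features

(* [load] stands in for loading the plugin file; it is expected to call
   [provide] once the plugin has finished setting up. *)
let require feature ~load =
  if not (List.mem feature !features) then begin
    load ();
    if not (List.mem feature !features) then
      failwith ("Loading did not provide feature " ^ feature)
  end

let load_count = ref 0

let load_plugin () =
  incr load_count;
  provide "ecaml-bf"

let () =
  require "ecaml-bf" ~load:load_plugin;
  require "ecaml-bf" ~load:load_plugin;
  (* The second [require] found the feature and skipped the load. *)
  Printf.printf "loaded %d time(s)\n" !load_count
```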
3 Feature rundown
What follows is a listing of the current modules in Ecaml (as of version v0.9.115.24+69), each with a brief summary of the functionality provided within.
Many of the modules correspond one-to-one with concepts in Emacs that are better documented in the manual than I could manage here. Module names are linked to the appropriate section of the Emacs or Emacs Lisp manuals.
- Advice
- Elisp has a system for “advising” functions, ways of adding to or modifying the behavior of a function without completely redefining it by writing a new function that is called before, after, or in place of the old one, etc. Ecaml currently only supports around advice.
- Ansi_color
- Contains functions for interpreting ANSI color escape sequences in Elisp strings and translating them into Elisp string properties which encode equivalent colors. Such strings might often be the result of running terminal-oriented version control or diff programs.[6]
- Auto_mode_alist
- Manages the variable auto-mode-alist, which determines how Emacs decides what major mode to open a file in, based on the filename.
- Backup
- Manages the variable make-backup-files, which controls whether Emacs will make backup files (like foo.c~) when you edit files.
- Buffer
- Everything to do with buffers, including killing buffers, displaying them, and finding out what files they’re visiting.
- Char_code
- Internally, Emacs represents characters as just a code point (integer). Technically, these are a superset of Unicode code points.
- Color
- Deals with colors and Color Names, which are various forms of plain English and RGB strings that Emacs can interpret as colors and display. See M-x list-colors-display for a list of available colors.
- Command
- A command is simply a function that can be called interactively, e.g., through M-x. It should specify a way of receiving its arguments interactively, such as by prompting the user for input, rather than through the normal function call mechanism.
- Comment
- Manages the active comment syntax.
- Compilation
- Not sure what this does.
- Current_buffer
- Emacs has a notion of the current buffer, which many Elisp functions operate on by default instead of accepting a buffer or buffer name as an argument. This module allows you to set the current buffer and use those functions.
- Customization
- Manages the definition and organization of customization items, which allow users to customize variables and (font) faces through an organized, interactive user interface.
- Directory
- Functions for managing file directories.
- Echo_area
- message and co. print their messages here, at the bottom of Emacs’s frame. Also allows you to temporarily inhibit messages in the echo area (but they’re still logged to *Messages*).
- Face
- Emacs’s notion of fonts, including font families, sizes, weights, styles, and decorations.[7]
- Feature
- Emacs records a set of named features provided by packages (they’re just symbols). Code that depends on a given feature can require it, which causes Emacs to load the package that provides it—unless it has already been loaded.
- File
- Functions for managing files (renaming, writing, permissions, etc.).
- Filename
- Functions for managing filenames (extensions, relative/absolute paths, etc.)
- Find_function
- find-function jumps to the source code where a function is defined, given its name. It’s not built-in as part of Emacs, but rather is defined by the Find Func package.
- Form
- Lisp treats all code as plain old data (cons cells and symbols and numbers, oh my!). Module Form contains functions specifically related to treating Lisp values as code (e.g., eval and quote).
- Frame
- Manages Emacs frames, which you probably call “windows” if you run GUI Emacs (as opposed to in the terminal). For Emacs’s windows, see below.
- Function
- Manages Elisp functions. Function.Fn.t is the common signature for all OCaml functions that are to be called from within Emacs.
- Grep
- Facility for running grep from within Emacs. grep is run asynchronously, and the results are collected in the *grep* buffer.
- Hash_table
- Elisp’s built-in hash table data structure.
- Hook
- Hooks identify logical places where you can register functions to be called. For example, you could attach a function to before-save-hook to remove trailing whitespace from a file you’re editing before you save it.
- Input_event
- Deals with user input events, such as mouse clicks and key presses. You could use Input_event.modifiers to determine whether the Control key was held down while a key was pressed.
- Key_sequence
- A sequence of one or more key presses that form a unit, such as C-x C-f. This module can be used to read key sequences from user input or simulate key presses using execute-kbd-macro.
- Keymap
- Keymaps relate input events to commands or other keymaps (allowing multiple input events to correspond to a single command). This is how keys are bound to commands, and we can use these to provide key bindings for any commands we provide in our plugin.
- Load
- Load another plugin, usually a Lisp file. Also contains path, which allows you to examine the load path, which is a list of directories where Emacs will look for the file name you pass to load.
- Load_history
- Best described by an excerpt from the doc comments:

      (* [update_emacs_with_entries] updates [load-history] with the information
         supplied to [add_entry], which make it possible to, within Emacs, jump
         from a symbol defined by Ecaml to the Ecaml source. *)

  Used behind-the-scenes by defcustom, defun, defvar, and so on.
- Major_mode
- Manages major modes, major mode keymaps, derived modes, etc.
- Marker
- Markers keep track of a certain position in a buffer. They are automatically adjusted when the buffer is edited, so that they maintain the same logical position if not the same offset.
- Minibuffer
- Often confused with the Echo Area (I used this term wrong in the first post of this series). Used for reading input from the user. For example, when you run an interactive command through M-x, the minibuffer is where you type the name of the command you want to run.
- Minor_mode
- Secondary modes you can enable and disable that provide additional functionality on top of the major mode.
- Obarray
- An internal Emacs data structure that stores a set of symbols, for use with intern and read. Normally there is only one, stored in the variable obarray, so there is only one symbol with a given name.
- Point
- The location of the cursor in a buffer. Module Point contains functions for setting the point and searching forward for a given string, for example.
- Position
- A number that describes the position of a character or cursor within a buffer. Starts at 1. Doesn’t automatically move when the buffer is edited, unlike a marker.
- Process
- Manages child processes of Emacs, such as shells, ispell, or Merlin. This also includes the Emacs server.
- Q
- Contains a large number of symbols, including keyword symbols (module Q.K) and ampersand-symbols like &optional (module Q.A). These are all stored here to avoid unnecessarily allocating OCaml data structures to refer to these symbols whenever they’re needed.
- Regexp
- Elisp regular expressions. They have slightly different syntax from regular expressions you might use in Perl or Perl-compatible regexps.[8]
- Selected_window
- Similar to Current_buffer, but for Emacs’s “windows”.
- Symbol
- Identifiers in Elisp are converted to symbols, which are “interned”. This means that two symbols with the same name[9] are physically the same object, and so can be compared with a simple pointer comparison. Module Symbol also contains functions for calling Elisp functions from OCaml, with convenience functions for different arities. For example, Symbol.funcall1 accepts a symbol, which should denote the name of a function, and a single argument, which is passed to the function. Functions in this module whose names end in _i ignore the return value and return unit instead of Value.t. These are useful since a lot of Elisp functions with side effects don’t return anything useful. Most of the other modules’ functionality is built on top of Symbol and its function-calling functions.
- Syntax_table
- Syntax tables are used to provide language-specific functionality, e.g., syntax highlighting.
- System
- Interacts with the operating system. Currently contains functions for setting and querying environment variables.
- Text
- Represents Emacs’s strings, which contain not only characters but also text properties, which enhance text to provide everything from (font) faces to read-only status to text-based “buttons”. See the manual for more on text properties.
- Timer
- Timers allow you to schedule a function to be run at a future time, possibly more than once.
- User
- Manages operating system users, such as their login names and UIDs.
- Value
- Lisps are dynamically typed, and every type of value is ultimately a subtype of Value.t. Contains a lot of type predicates that allow you to test the type of an arbitrary value.
- Var
- Special features available to variables, such as setting a buffer-local value or a default value.
- Vector
- Elisp’s built-in vector data structure.
- Window
- Confusingly, Emacs uses the term window to refer to what most people might call window panes. A window displays a buffer.
- Working_directory
- Each buffer, as well as Emacs itself, has a working directory. Relative paths are resolved relative to this directory. Working_directory.within runs a function with the current working directory set to a given value.
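Of the ideas in this list, interning is perhaps the easiest to show in plain OCaml. The sketch below is a toy obarray built on Hashtbl, with made-up names throughout (symbol, obarray, intern); Ecaml's Symbol.intern hands the work to Emacs rather than doing anything like this. It illustrates the property described under Symbol: interning the same name twice yields the very same object, so a physical equality check (==) is enough to compare symbols.

```ocaml
(* Toy symbol type and obarray; not Ecaml's real [Symbol.t]. *)
type symbol = { name : string }

let obarray : (string, symbol) Hashtbl.t = Hashtbl.create 16

(* Return the unique symbol with the given name, creating it on first use. *)
let intern name =
  match Hashtbl.find_opt obarray name with
  | Some sym -> sym
  | None ->
    let sym = { name } in
    Hashtbl.add obarray name sym;
    sym

let () =
  let a = intern "foo" in
  let b = intern "foo" in
  let c = intern "bar" in
  (* [==] is physical equality: same object in memory. *)
  assert (a == b);
  assert (a != c);
  Printf.printf "the obarray holds %d symbols\n" (Hashtbl.length obarray)
```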
4 Phew!
That was quite a few modules. We’ll use some of them next week in building our interpreter plugin. Catch you next time.
Footnotes:
[1] Plus, I have no idea what many of the individual pieces do.
[2] ppx_jane, Jane Street’s set of ppx rewriters, includes ppx_here, so if you already have that installed, you can use it to provide [%here]. You’ll need to add the following to your jbuild file:

    (jbuild_version 1)

    (executables
     ((names (main))
      (libraries (ecaml))
      (preprocess (pps (ppx_jane))))) ; add this line

[3] If you’re not keen on using syntax extensions, you can use OCaml’s built-in __POS__ macro. Unfortunately, this macro returns a tuple instead of a record, but since the tuple has the same number of fields in roughly the right order, you can do Obj.magic __POS__ to get a Lexing.position. (The last two components of the tuple are character positions within the line rather than true pos_bol and pos_cnum offsets, so only the file name and line number of the result should be trusted.) But don’t tell anyone I told you that!
[4] Elisp is a Lisp-2, meaning that functions and variables live in two different and non-overlapping namespaces. That’s why defining functions and variables uses different mechanisms (unlike in, say, OCaml!).
[5] See? Interfacing OCaml with Lisp isn’t that weird after all!
[6] In fact, this was one of the key original motivations for Ecaml.
[7] Surprisingly, faces in Elisp live in a completely separate namespace from variables and functions. So perhaps Elisp is a Lisp-3?
[8] For example, the grouping operator is \( ... \), not ( ... ). Plain old parentheses just match literal parentheses in the target string.
[9] In the same obarray.