Count me in as another one with a longstanding, mostly-dream project aiming for a human-enjoyable notation grammar.
For me it came from tracker notation (Buzz), where I was wildly underwhelmed by all that whitespace for timing (well, empty cells for timing) and the lack of parameterizable macros. A seriously underexplored field, perhaps because almost everybody who ever started got pulled in by the lure of textually defined synthesis.
www.colinfraser.com/m5000/ample-nucleus-pg.pdf
For macros, see:
https://www.retro-kit.co.uk/user/custom/Acorn/3rdParty/Hybri...
https://www.retro-kit.co.uk/user/custom/Acorn/3rdParty/Hybri...
https://en.wikipedia.org/wiki/ABC_notation https://abcnotation.com/
ABC notation is more oriented towards traditional sheet music, with regular note lengths, standard Western tuning and a simple, readable syntax. It isn't meant for playing back music that sounds good to the ear. It's hard to catch the nuances of a real human performance with it, but it works well as a lead sheet for musicians. Its expressive markings are relatively limited and interpreted subjectively.
MTXT focuses on editable recordings of live performances, preserving all of those tiny irregularities that make the music human. It can represent arbitrary timings, subtle expressive variations and even arbitrary tuning systems. MTXT can also capture transitions like crescendos and accelerandos exactly as they happened.
* Perl MIDI::Score -- https://metacpan.org/pod/MIDI::Score
* Csound standard numeric scores -- https://csound.com/docs/manual/ScoreTop.html
* CsBeats (alternative score language for Csound) -- https://csound.com/docs/manual/CsBeats.html
https://en.wikipedia.org/wiki/LilyPond#Integration_into_Medi...
https://www.mutopiaproject.org
https://lilypond.org/text-input.html
\relative c' {
\key d \major
fis4 fis g a
a g fis e
d d e fis
fis4. e8 e2
}
...but why is it so complicated? A novice interpretation of "music" is "a bunch of notes!" ... my amateur interpretation of "music" is "layers of notes".
You can either spam 100 notes in a row, or you effectively end up with:
melody = [ a, b, [c+d], e, ... ]
bassline = [ b, _, b, _, ... ]
music = melody + bassline
score = [
  "a bunch of helper text",
  + melody,
  + bassline,
  + page_size, etc...
]
...so LilyPond basically made "TeX4Music", and the format serves a few purposes at once:
Engraving! Basically "typesetting" the music for human eyeballs (ie: `*.ly` => `*.pdf`).
Rendering! Basically "playing" the music for human ears (ie: `*.ly` => `*.mid`)
Librarification! Basically, if your music format has "variables" and "for-loops", you can end up with an end score that's something like: `song = [ intro + chorus + bridge + chorus + outro ]`, and then not have to chase down and modify all the places you use `chorus` when you modify it. (See this answer for more precision: https://music.stackexchange.com/a/130894 )
...now imagine doing all of the above for multiple instruments and parceling out `guitar.pdf`, `bass.pdf`, `drums.pdf` and `whole-song.pdf`
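To make the "librarification" point concrete, here is a rough Python-flavoured sketch of the layers-plus-variables idea (the note tuples and section names are made up for illustration, not any real engraving API):

    # Rough sketch of "layers + reusable sections"; a note is (pitch, beats).
    chorus_melody = [("A4", 1), ("B4", 1), (["C5", "E5"], 2)]   # last entry is a chord
    chorus_bass   = [("A2", 2), ("E2", 2)]

    intro  = {"melody": [("A4", 4)], "bass": [("A2", 4)]}
    chorus = {"melody": chorus_melody, "bass": chorus_bass}

    # Reuse: edit `chorus` once and every occurrence below picks up the change.
    song = [intro, chorus, chorus]

    for section in song:
        for voice, notes in section.items():
            print(voice, notes)

Edit `chorus_melody` once and every chorus in `song` changes, which is exactly the chase-down problem the variables solve.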
TL;DR: Music is haaard, and a lot closer to programming than you think!
sporkl 1 hours ago [-]
LilyPond is the only music engraving system I'm aware of that can handle polytempo scores. The TeX-ness really comes in handy.
vessenes 8 hours ago [-]
Cool. My one concern with this is that it has no horizontally scannable note/chord mode. It’s super common for humans to read a sequence of notes left to right, or write it that way, but it’s also just more efficient in terms of scanning / reading.
Can I suggest a guarded mode that specifies how far apart each given note/chord is by the count, e.g.
#1.0:verse1
Am - C - G - E - F F F F
#
You could then repeat this or overlay a melody line like
Etc. I think this would be easier to parse and produce for an LLM, and it would compile back to the original spec easily as well.
https://youtu.be/eclMFa0mD1c
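For what it's worth, a toy parser for that kind of block is tiny. This is only a sketch of the suggestion above (the "#start_beat:name" header and one-symbol-per-beat spacing are my assumptions, not anything MTXT defines):

    # Toy parser for a hypothetical "guarded" block like:
    #   #1.0:verse1
    #   Am - C - G - E - F F F F
    #   #
    # Assumption: each whitespace-separated symbol lasts one beat; "-" holds the previous chord.
    def parse_guarded(block: str):
        lines = [l.strip() for l in block.strip().splitlines()]
        start_beat, name = lines[0].lstrip("#").split(":")
        events, beat = [], float(start_beat)
        for symbol in " ".join(lines[1:-1]).split():
            if symbol != "-":                  # "-" just extends the previous chord
                events.append((beat, symbol))  # (beat position, chord symbol)
            beat += 1.0
        return name, events

    print(parse_guarded("#1.0:verse1\nAm - C - G - E - F F F F\n#"))
    # ('verse1', [(1.0, 'Am'), (3.0, 'C'), (5.0, 'G'), (7.0, 'E'), (9.0, 'F'), (10.0, 'F'), (11.0, 'F'), (12.0, 'F')])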
daninet 7 hours ago [-]
I considered it but decided against it in the first version, because specifying note durations is too tricky. It was more important to get the .mid -> MTXT conversion and live-performance recording working, where notes usually have irregular lengths.
Representations like "C4 0.333 D4 0.333 E4 0.25" feel too hard to read.
matheusmoreira 5 hours ago [-]
To me it seems like files could get hard to understand if events that happen simultaneously aren't horizontally lined up like this:
2.0 voice1 | voice2 | ...
Like a text version of old school tracker interfaces:
This made me remember an old set of tools called mtx2midi and midi2mtx; I used them to edit some MIDI files while making sure I wasn't introducing any unwanted changes.
While the roundtrip output was not binary-identical, it still sounded the same.
Looks like the MTXT tool here does not quite work for this use case: the result of the roundtrip of a MIDI I tried has a segment folded over, making two separate segments play at the same time while the total duration got shorter.
https://files.catbox.moe/5q44q0.zip (buggy output starts at 42 seconds)
I created an issue here: https://github.com/Daninet/mtxt/issues/1
It reminded me of ABC and the tools abc2midi and midi2abc.
chaosprint 4 hours ago [-]
Some simple thoughts:
I feel that one challenge of programming languages is how to remember these rules, formats, and keywords. Even if you're using familiar formats like YAML or JSON, how do you match keywords?
When developing Glicol (http://glicol.org/), I found that if it's based on an audio graph, all node inputs and outputs are signals, which at least reduces the matching problems. The remaining challenge is ensuring that reference documentation is available at minimal cost.
jasonjmcghee 4 hours ago [-]
This would lend itself well to a live-coding/live-music experience.
I played around with a similar idea on my own (very simple / poor) text music environment:
https://github.com/jasonjmcghee/vscode-extension-playground?...
I'm in the middle of making an extension to allow making VS Code extensions live, because I wanted a faster development feedback loop.
rock_artist 9 hours ago [-]
Hey, the idea is nice,
It would be great to know what pushed you to start this format.
Also, any apps that use it would benefit from being added to the repo, assuring usability in addition to readability.
daninet 7 hours ago [-]
My initial goal was to fix some mistakes in the MIDI files I recorded from my keyboard. I was also interested in making dynamic tempo and expression changes without dealing with complicated DAW GUIs.
Now I'm working on a synth that uses MTXT as its first-class recording format, and it's also pushing me to fine-tune a language model on it.
1313ed01 10 hours ago [-]
I like the idea overall. Looks like something that would be fun to combine with music programming languages (SuperCollider/Of etc).
Not so sure how human-friendly the fractional beats are? Is that something that people more into music than I am are comfortable with? I would have expected something like MIDI's "24 ticks per quarter note" instead. And a format like bar.beat.tick. Maybe just because that is what I am used to.
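For whatever it's worth, going between the two views is just arithmetic. A sketch assuming 4/4 and the MIDI-style 24 ticks per quarter note (neither of which MTXT mandates, as far as I know):

    PPQ = 24            # ticks per quarter note, as in the MIDI clock convention
    BEATS_PER_BAR = 4   # assuming 4/4

    def beats_to_bbt(beats: float):
        # Convert an absolute beat position (0-based) to (bar, beat, tick).
        bar, beat_in_bar = divmod(beats, BEATS_PER_BAR)
        beat, frac = divmod(beat_in_bar, 1)
        return int(bar) + 1, int(beat) + 1, round(frac * PPQ)

    print(beats_to_bbt(0.0))   # (1, 1, 0)
    print(beats_to_bbt(5.5))   # (2, 2, 12): bar 2, beat 2, half a beat = 12 ticks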
daninet 7 hours ago [-]
The library has an MIT license; I would be more than happy to see people use it in different synths.
I'm planning to add support for math formulas in beat numbers, something like: "15+/3+/4" = 15.58333
soperj 6 hours ago [-]
> "15+/3+/4"
Can you explain how to read that? 15 plus divided by 3 plus divided by 4?
daninet 6 hours ago [-]
It's a shorthand for 15 + (1/3) + (1/4), but I'm still not settled on the syntax.
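To pin the reading down, here is a tiny sketch that treats each "+/n" as "+ 1/n" (my interpretation of the draft syntax above, which isn't settled):

    from fractions import Fraction

    def parse_beat(expr: str) -> Fraction:
        # Parse e.g. "15+/3+/4" as 15 + 1/3 + 1/4 (assumed reading of the draft syntax).
        whole, *parts = expr.split("+")
        total = Fraction(whole)
        for part in parts:                      # each part looks like "/3"
            total += Fraction(1, int(part.lstrip("/")))
        return total

    val = parse_beat("15+/3+/4")
    print(val, float(val))   # 187/12 (~15.5833)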
bonzini 8 hours ago [-]
It should be fine, but fractions (or both fractions and decimals) would be preferable in order to express triplets (3 over 2, effectively a duration of 0.3333...)
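To make the triplet arithmetic concrete with exact rationals (plain Python fractions here, nothing MTXT-specific):

    from fractions import Fraction

    # An eighth note inside a 3:2 tuplet: nominal 1/2 beat scaled by 2/3.
    triplet_eighth = Fraction(1, 2) * Fraction(2, 3)
    print(triplet_eighth)              # 1/3
    print(3 * triplet_eighth == 1)     # True: three of them fill exactly one beat
    print(3 * 0.3333 == 1)             # False: the rounded decimal never adds back up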
intrasight 9 hours ago [-]
I think that for completeness it needs looping and conditional constructs
xrd 6 hours ago [-]
I've been spending the last week casually looking at strudel.cc.
They have a notation that looks similar (basically a JavaScript port of the Haskell version).
I like this, but I'm curious why I would want to use this over strudel. Strudel blends the language with a js runtime and that's really powerful and fun.
There is also https://www.vexflow.com/, which has a text format and typesets it for you nicely.
throw7 6 hours ago [-]
It makes no sense to design for LLMs. Do what makes sense for the reader and forget that LLMs exist at all.
amingilani 6 hours ago [-]
What prompted this and why does it not?
badlibrarian 5 hours ago [-]
It's not the 19th Century. You don't need to punch holes in cards to help the machine "think" any more.
otabdeveloper4 5 hours ago [-]
> You don't need to punch holes in cards to help the machine "think" any more.
That's literally what "prompt engineering" is, though.
badlibrarian 4 hours ago [-]
"Transpose this MIDI file down a third" requires neither a specialized data format nor fancy prompt engineering. ChatGPT asked: "A) Major third up (+4 semitones) or B) Minor third up (+3 semitones)" then did it.
formula1 3 hours ago [-]
This is pretty neat
I'm wondering if it can be used alongside Strudel:
https://strudel.cc/
Either mtxt => Strudel or Strudel => mtxt.
Here's Strudel in action: https://www.youtube.com/shorts/YFQm8Hk73ug