ANEForge - using the tool
ANEForge is a direct, CoreML-free Python frontend for the Apple Neural Engine:
it lowers a tensor graph into one fused Espresso e5rt program and dispatches
it to ANE silicon from an ordinary user process, with no CoreML and no special
entitlement. These docs are the software manual - how to install, call,
train, target, and extend the frontend.
By task
Get something running
- Getting started - install, build, first program.
- aneforge API - the graph -> compile -> run frontend reference.
- Training on the ANE - on-ANE autograd and the `Trainer` loop.
- FAQ - common questions, gotchas, what to expect.
- MIL primer - writing MIL programs by hand.
Use the engine well
- Cross-chip deployment - compiling and gating for other ANE families (M1-M5, 28 targets), `cross_compile_check`, `detect_family`, fp16 portability.
- Dispatch backends - Path A vs e5rt vs MPSGraph vs CoreML, and which to use.
- e5rt dispatch reference - the full e5rt path: call sequence, the `ane_e5rt_*` C ABI, multi-op / async / pipelining / IOSurface.
Know what works
- Capabilities - operator coverage, dtype matrix, known limits.
- Op catalog - every native MIL op x device (M1-M5), generated from the package's `_op_catalog.py` (the runtime `af.op_info` data); the exhaustive Y/~/N table.
Contribute
- Development - building, testing, adding ops.
- Glossary - terminology used across docs and code.
- Roadmap - next directions, open unknowns, and known bottlenecks.
By question
| Question | Document |
|---|---|
| How do I install + run? | getting-started |
| How do I use the Python frontend? | aneforge-api |
| How do I train a model on the ANE? | training |
| Can I target / deploy to another chip (M1-M5)? | cross-chip |
| How do I estimate latency without the hardware? | aneforge-api: cost estimation, cross-chip |
| How do I shrink weights (int4 / sparse)? | aneforge-api: weight compression |
| What ops are supported? | capabilities, op-catalog |
| What's "Path A"? | glossary, dispatch |
| How do I write a MIL program? | mil-primer |
| Why fp16? | faq, capabilities |
| Can I use this in production? | faq |
| What macOS versions are supported? | faq |
| How do I add a new operator? | development |
| Why does my call take 195 ms? | dispatch |