Software Development Kit (SDK)

We’ve built our software development platform to help companies of all sizes deploy optimal sparse AI models for tomorrow’s applications and form factors.

Our SDK contains advanced sparse model optimization tools, a custom compiler, and a fast performance simulator. It’s everything you need from exploration to deployment.

Widely Used ML Frameworks

Develop and deploy networks from high-level Python frameworks like PyTorch.
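
Any standard PyTorch module can serve as the starting point for the flow below. Here is a minimal sketch of the kind of model the later steps assume (the MyModel name and layer sizes are illustrative, not part of the SDK):

    import torch.nn as nn

    class MyModel(nn.Module):
        """A small sequence model; sizes are illustrative."""
        def __init__(self):
            super().__init__()
            self.rnn = nn.GRU(input_size=64, hidden_size=128, batch_first=True)
            self.head = nn.Linear(128, 10)

        def forward(self, x):
            out, _ = self.rnn(x)   # x: (batch, seq, features)
            return self.head(out)  # per-frame predictions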

Easy Optimization

Easily prune, quantize, and fine-tune sparse models with our model optimization tools.

Rapid Development

Estimate energy, latency, throughput, and model footprint from a Python API.

Get Started

It's a simple 3-step process

1. Quantize

    from fmot import ConvertedModel

    model = MyModel()  # Start from a PyTorch model

    # Convert and quantize
    cmodel = ConvertedModel(model, batch_dim=0, seq_dim=1)
    cmodel.quantize(quantization_inputs)  # Quantize the model on a set of inputs
    fqir_graph = cmodel.trace()  # Serializable IR for the quantized model

Quantize the model and convert it to a serializable intermediate representation
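
The quantization_inputs above are representative example inputs used to calibrate the quantizer. A minimal sketch, assuming the illustrative MyModel above and random tensors standing in for a real calibration set (in practice, use representative data; the exact expected format is described in the SDK documentation):

    import torch

    # Hypothetical calibration set: (batch, seq, features) tensors
    quantization_inputs = [torch.randn(8, 100, 64) for _ in range(10)]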

2. Simulate

    from femtodriver import Femtodriver

    # Simulate the power consumption, memory footprint, and latency of your model
    with Femtodriver() as fd:
        meta_dir, femtofile_path, femtofile_size = fd.compile(fqir_graph)
        metrics = fd.simulate(
            input_period,  # Time between frames, used to estimate leakage
            input_file,    # The input file to run the simulation on
        )

Simulate the energy, latency, throughput, and footprint of your model for rapid model development
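
As a usage sketch, the simulator can be driven end to end with an illustrative frame period and input capture (both values below are hypothetical placeholders, not SDK defaults):

    input_period = 0.016  # hypothetical: 16 ms between input frames
    input_file = "example_inputs.npy"  # hypothetical recorded input file

    with Femtodriver() as fd:
        meta_dir, femtofile_path, femtofile_size = fd.compile(fqir_graph)
        metrics = fd.simulate(input_period, input_file)
    print(metrics)  # inspect the estimated energy, latency, and footprint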

3. Deploy

    from femtodriver import Femtodriver

    # Generate binaries to deploy on the SPU
    with Femtodriver() as fd:
        meta_dir, femtofile_path, femtofile_size = fd.compile(fqir_graph)

Use the Femtocrux compiler to deploy to the SPU for verification, testing, and production

Further Optimizations

Push beyond the limits for unparalleled efficiency

4. Sparsify

    # Optional: sparsify your model during training
    for batch in train_dataloader:
        pruner.prune(model)  # Apply a pruning step each iteration
        optimizer.zero_grad()
        outputs = model(batch)
        loss = loss_fn(outputs, batch)
        loss.backward()
        optimizer.step()

Prune the model to a sparse representation that the SPU is uniquely designed to accelerate
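
The pruner above comes from the SDK's model optimization tools. As an illustration of the underlying idea only (a generic stand-in, not the SDK's pruner API), magnitude pruning can be expressed with plain PyTorch:

    import torch.nn as nn
    import torch.nn.utils.prune as prune

    def magnitude_prune(model, amount=0.5):
        """Zero out the smallest-magnitude weights in each Linear layer."""
        for module in model.modules():
            if isinstance(module, nn.Linear):
                prune.l1_unstructured(module, name="weight", amount=amount)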

5. Quantization Aware Training

    # Optional: run quantization-aware training
    for batch in train_dataloader:
        optimizer.zero_grad()
        outputs = cmodel(batch)  # Train the converted (quantized) model directly
        loss = loss_fn(outputs, batch)
        loss.backward()
        optimizer.step()

Run quantization-aware training to maintain high model performance despite the tiny scale
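
After fine-tuning, re-trace the converted model with the trace() call from step 1 so the serialized IR reflects the updated weights:

    fqir_graph = cmodel.trace()  # Re-export the IR after QAT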

Get Started

Ready to start developing?
Contact Us
