Widely-used ML Frameworks
Develop and deploy networks from high-level Python frameworks like PyTorch.
```python
from fmot import ConvertedModel

model = MyModel()  # Start from a PyTorch model

# Convert and quantize
cmodel = ConvertedModel(model, batch_dim=0, seq_dim=1)
cmodel.quantize(quantization_inputs)  # Quantize the model on a set of inputs
fqir_graph = cmodel.trace()  # Serializable IR for the quantized model
```
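For context, `MyModel` and `quantization_inputs` are not defined in the snippet above. The sketch below is a hypothetical setup: a small sequence model laid out as (batch, seq, features) to match `batch_dim=0, seq_dim=1`, plus a list of representative calibration tensors. The architecture and shapes are illustrative only.

```python
# Hypothetical stand-ins for MyModel and quantization_inputs; the layer
# choices and shapes are illustrative, not prescribed by fmot.
import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, in_features=32, hidden=64, out_features=8):
        super().__init__()
        self.rnn = nn.GRU(in_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_features)

    def forward(self, x):  # x: (batch, seq, features)
        h, _ = self.rnn(x)
        return self.head(h)

# Representative inputs used to calibrate quantization ranges
quantization_inputs = [torch.randn(8, 100, 32) for _ in range(10)]
```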
```python
from femtodriver import Femtodriver

# Simulate power consumption, memory footprint, and latency of your model
with Femtodriver() as fd:
    meta_dir, femtofile_path, femtofile_size = fd.compile(fqir_graph)
    metrics = fd.simulate(
        input_period,  # Time between frames, used to estimate leakage
        input_file,    # The file to run the simulation on
    )
```
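The snippet above leaves `input_period` and `input_file` unbound; they must be defined before the `with` block runs. The values below are purely illustrative placeholders, assuming the period is given in seconds and the inputs live in a NumPy file.

```python
# Hypothetical values for the simulation inputs; both values are
# placeholders, not defaults from femtodriver.
input_period = 0.016       # Assumed 16 ms between input frames (units assumed seconds)
input_file = "inputs.npy"  # Assumed path to recorded inputs driving the simulation
```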
```python
# Generate binaries to deploy on the SPU
with Femtodriver() as fd:
    meta_dir, femtofile_path, femtofile_size = fd.compile(fqir_graph)
```
```python
# Optional: sparsify your model during training
for batch in train_dataloader:
    pruner.prune(model)
    optimizer.zero_grad()
    outputs = model(batch)
    loss = loss_fn(outputs, batch)
    loss.backward()
    optimizer.step()
```
Prune the model to a sparse representation that the SPU is uniquely designed to accelerate.
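The `pruner` object in the loop above is left undefined. As a stand-in, here is a minimal magnitude-pruning sketch built on PyTorch's `torch.nn.utils.prune` utilities; it illustrates the pattern but is not Femtosense's own pruning API.

```python
# Generic magnitude pruner; a hypothetical stand-in for `pruner` above,
# not fmot/femtodriver's pruning utility.
import torch.nn as nn
import torch.nn.utils.prune as prune

class MagnitudePruner:
    """Zeroes out the smallest-magnitude weights in every Linear layer."""

    def __init__(self, amount: float = 0.9):
        self.amount = amount  # Fraction of weights to prune away

    def prune(self, model: nn.Module) -> None:
        for module in model.modules():
            if isinstance(module, nn.Linear) and not prune.is_pruned(module):
                # Installs a fixed mask zeroing the lowest-magnitude weights;
                # once pruned, later calls are no-ops for that layer.
                prune.l1_unstructured(module, name="weight", amount=self.amount)

pruner = MagnitudePruner(amount=0.9)
```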
```python
# Optional: run quantization-aware training (QAT)
for batch in train_dataloader:
    optimizer.zero_grad()
    outputs = cmodel(batch)  # Fine-tune the converted, quantized model
    loss = loss_fn(outputs, batch)
    loss.backward()
    optimizer.step()
```
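If you run QAT, you would presumably re-trace afterward so the deployed FQIR captures the fine-tuned weights. This short sketch reuses the `trace()` call from the conversion snippet and assumes `ConvertedModel` behaves like a standard `nn.Module`; both are assumptions, not confirmed by the source.

```python
# Re-trace after QAT so the deployed graph reflects fine-tuned weights
# (assumes ConvertedModel supports eval() like a standard nn.Module)
cmodel.eval()
fqir_graph = cmodel.trace()
```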