45vishal
@45vishal
Hello, I am new to working with FINN and deep learning. I am planning to do my thesis on optimizing FINN synthesis using VHDL/Verilog on Alveo accelerators. Can you help me figure out how to proceed?
mckr
@mckr
Hello everyone. First of all, thanks for this amazing work. I am investigating the end2end example and trying to understand how it works, but there is a point about the input data bit conversion that I don't understand. The input image is 8-bit, so FINN must convert it to 1-bit in order to use it in a BNN. I looked at the input data before and after the conversion and realized that if the 8-bit input data is nonzero, then the 1-bit input data after conversion is 1. I would expect the 1-bit input to be 1 only if the input data were above some value (such as 128). Could anyone clarify this point for me?
Yaman Umuroglu
@maltanar

@RSPwFPGAs > I'm glad to contribute to FINN. I have created the pull request for the first one. So far I have not seen anything wrong regarding the second one. I have another Vivado project on the host side and it can access the IPs created in the 3rd step in tfc_end2end_example. I will check the projects created in the 4th step.

I don't think you'll see any issues at any of the steps, or at all while you are inside the container. But if you try to open any of the generated Vivado projects on the host machine (e.g. because you want to use the GUI, the IP integrator etc.), those may have trouble with broken absolute paths.

Yaman Umuroglu
@maltanar

Hello everyone. First of all, thanks for this amazing work. I am investigating the end2end example and trying to understand how it works, but there is a point about the input data bit conversion that I don't understand. The input image is 8-bit, so FINN must convert it to 1-bit in order to use it in a BNN. I looked at the input data before and after the conversion and realized that if the 8-bit input data is nonzero, then the 1-bit input data after conversion is 1. I would expect the 1-bit input to be 1 only if the input data were above some value (such as 128). Could anyone clarify this point for me?

Hey @mckr -- you should have a look at either the PyTorch model at https://github.com/maltanar/brevitas_cnv_lfc/blob/master/training_scripts/models/TFC.py or the resulting graph during the initial steps of the end2end flow to understand how the input quantization works. The PyTorch input is actually a float tensor with each value between 0 and 1 (even though the underlying data is 8-bit). There you can see the model first remaps it to be between -1 and +1 by applying 2*x-1, and then applies the 1-bit input quantizer, which maps everything to its sign.
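A minimal sketch of that input path (hypothetical pixel values, mirroring the remap-then-sign behaviour just described):

  import torch

  # hypothetical 8-bit pixel values, already scaled to [0, 1] by the dataloader
  x = torch.tensor([0.0, 0.3, 0.5, 0.8, 1.0])
  x = 2 * x - 1                        # remap from [0, 1] to [-1, +1]
  # 1-bit quantizer: everything maps to its sign
  x_1bit = torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))
  print(x_1bit)                        # tensor([-1., -1., 1., 1., 1.])

So under this mapping a pixel becomes +1 only when it is at least half of the full scale (roughly 128 in 8-bit terms), not merely nonzero.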

Yaman Umuroglu
@maltanar

Hello, I am new to working with FINN and deep learning. I am planning to do my thesis on optimizing FINN synthesis using VHDL/Verilog on Alveo accelerators. Can you help me figure out how to proceed?

Hi! Do you mean you want to replace the HLS backend we have with Verilog/VHDL modules instead? This may take quite a bit of effort, as you'd have to create parametrizable modules matching what our templated HLS functions can do. The Alveo port would also need some extra effort, although the PYNQ team recently announced the first PYNQ for Alveo, and that'd make it easier.

PASTANERD
@PASTANERD
Hello! I am studying FPGA accelerators for deep learning inference with the BNN-PYNQ repository. I read some of the FINN papers and deployed the repository on my PYNQ-Z2 board. When I tested it, I noticed an accuracy drop caused by the framework. When I tested the CNV-W1A1 network with the 10,000 CIFAR10 test images on Theano, I got 79.1% accuracy. When I tested the same network and test images on the PYNQ-Z2 board, however, I got 74.4% accuracy, which is almost 5% less than the Theano result. Due to my lack of knowledge, it is quite hard for me to understand what exactly makes this difference. I can only guess that they differ because Theano computes with float parameters while the PYNQ-Z2 board uses integer weights and thresholds. But I'm not sure this is the real reason. Is there anyone who can help me?
Yaman Umuroglu
@maltanar
@PASTANERD there seems to be a similar GitHub issue open here: Xilinx/BNN-PYNQ#134 -- perhaps by you? In any case, I'd follow the discussion there. Any more details you can provide about how you built/tested the accelerator and about your setup will help diagnose the problem.
PASTANERD
@PASTANERD
Hello, @maltanar Thanks for the fast reply! The question on GitHub was actually written by my friend, who is studying with me. We just wanted to get some answers, and I found this place, so I posted a similar question here. I replied there with more details about what we've done and, frankly, there is not much we changed, because we didn't train a new model. I look forward to your reply!
Hendrik Borras
@HenniOVP
Hi,
the last few days I have been looking a bit at FINN's end2end performance in terms of execution time, and two performance-related questions came up:
  1. When timing the tfc_end2end_example I noticed that the synthesis of HLSCustomOp nodes using the HLSSynth_IPGen() transformation in particular can take very long, and this execution time seems heavily dependent on network size. Since each node is in essence an independent synthesis run, I was wondering if it would be possible to parallelize this? Maybe even by using Python's builtin multiprocessing tools and just starting all calls to Vivado at the same time?
  2. What kind of system would you recommend for synthesis? I would guess a high-clock CPU with large caches?
Yaman Umuroglu
@maltanar
Hi @HenniOVP -- welcome! 1. Synthesis times for certain transformations like IPGen can definitely be improved via parallelization, as they are essentially embarrassingly parallel across nodes; I actually took some notes a little while ago on how this could be implemented as part of FINN's transformation system. I haven't gotten around to trying any of that yet, though. Please let me know if this is something you would be interested in helping out with :) 2. It depends on the size of the synthesis, but a big beefy Xeon with large caches and lots of DRAM definitely helps. Parallel synthesis puts extra stress on the resources too; see https://forums.xilinx.com/t5/General-Technical-Discussion/Best-CPU-RAM-recommendation-for-Vivado-Logic-and-High-level/td-p/819755 for a discussion.
13 replies
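A minimal sketch of the parallelization idea from the thread above (hypothetical helper names, not FINN's actual transformation code), using Python's builtin multiprocessing to launch several independent Vivado HLS runs at once:

  import multiprocessing as mp
  import subprocess

  def synth_node(tcl_script):
      # each HLSCustomOp node's IP generation is an independent Vivado HLS run
      subprocess.run(["vivado_hls", "-f", tcl_script], check=True)

  def synth_all_parallel(tcl_scripts, workers=4):
      # embarrassingly parallel across nodes: one worker process per synthesis job
      with mp.Pool(workers) as pool:
          pool.map(synth_node, tcl_scripts)

Note that each Vivado run is itself resource-hungry, so the worker count should stay well below what the CPU/RAM can sustain (see the forum link above).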
BTW I just found the "enable threaded conversations" switch in Gitter, let's use threads from now on to keep the chat a bit more organized
Quentin Ducasse
@QDucasse

Hi! I am currently doing my MSc dissertation on mixed precision applied to CNNs on FPGAs and found FINN to be a really interesting project. I have two boards to conduct experiments on: a Nexys A7 and a Zedboard 7000. Do you know if I could use FINN on either of them?

Thanks in advance!

3 replies
anatoly
@anadb
Hello! I am experimenting with Brevitas, and my goal is to evaluate the robustness of a pre-trained quantized NN against small changes in its weights. So if I want to modify the weight tensor, which method should I use: model.fc1.weights or model.fc1.int_weights?
3 replies
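A generic sketch of the kind of perturbation being described, using the standard PyTorch weight attribute. Whether Brevitas re-quantizes from the float weights or from the integer weights is exactly the question above, so this is only illustrative; the model and layer below are hypothetical stand-ins:

  import torch
  import torch.nn as nn

  model = nn.Sequential()           # stand-in; replace with the Brevitas model
  model.fc1 = nn.Linear(784, 64)    # hypothetical layer playing the role of model.fc1

  with torch.no_grad():
      w = model.fc1.weight                    # float weight tensor (nn.Linear convention)
      w.add_(0.01 * torch.randn_like(w))      # small random perturbation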
kf7lsu
@kf7lsu
Hi, I am exploring the possibility of accelerating neural networks for high-energy particle physics using FINN. I have been able to successfully load the docker image and run through the end-to-end examples. I am now looking at replicating one of our existing ML algorithms in Brevitas. Documentation about training with Brevitas seems sparse; I only found this example: Xilinx/brevitas#47. Is there more documentation for training with Brevitas? It seems that the training dataset is supposed to contain bit-width information, while I just have it loaded in a numpy array. Is there a particular way to load datasets? I see something about unpack_input() and pack_output() in the forward functions of linear layers. When I add a relu layer at the front, I don't get the same error; instead I get a complaint about the input not being a tensor, which can be resolved by running torch.from_numpy(). After running that, I get an error saying that a Double argument was expected but a Float argument was received, despite my data being loaded as Doubles. Any insights? I think that seeing more training examples would help resolve all these issues.
2 replies
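On the Double-vs-Float error: whichever direction the mismatch goes, the usual fix is to make the input dtype match the model's parameters. A minimal sketch with stand-in data (hypothetical shapes):

  import numpy as np
  import torch

  data = np.random.rand(100, 784)      # stand-in dataset; numpy defaults to float64 ("Double")
  x = torch.from_numpy(data).float()   # cast to float32 ("Float"), PyTorch's default weight dtype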
Vedanta Pawar
@vedantapawar
Hi, you guys are doing fantastic work with these toolkits. I have a quick question: I want to train an object detection model on my own dataset, so should I start with the BNN-PYNQ repository, or should I start developing with the FINN repository? Thank you in advance.
3 replies
Hendrik Borras
@HenniOVP
Hi, I have some more general questions. Does FINN currently support residual connections, or will it at some point? I was looking at running ResNet in particular.
And second, does FINN support depthwise convolutions, or will it at some point?
3 replies
Andrea Bachechi
@B4k3
Hi! I would like to run some benchmarks on FINN using state-of-the-art neural networks, such as Tiny YOLOv3 and Inception/GoogLeNet.
Has FINN already been tested with these networks? If not, are there any guidelines for using FINN with such networks?
Thanks in advance!
5 replies
bharaini
@bharaini
Hello,
I want to implement a face recognition model on a PYNQ-Z2 FPGA and get comparison results between CPU/GPU and FPGA, just like the BNN examples compare hardware and software classification. Would it be possible to do this using FINN on PYNQ? Thanks in advance!
1 reply
Yaman Umuroglu
@maltanar
The next (v0.3b) release of FINN is imminent and we're just putting on the final touches. There's a slew of new features, including initial support for convolutions, decoupled access-execute compute engines for faster p&r, automated FIFO insertion between layers, throughput testing, parallel synthesis/compilation transformations thanks to @HenniOVP, and more. We're hoping to get this out on Friday; I'll keep the channel posted. You can always have a look at the staging branch if you are curious :-)
1 reply
Yaman Umuroglu
@maltanar
@/all FINN v0.3b is now tagged on GitHub: https://github.com/Xilinx/finn/releases/tag/v0.3b -- read the release blogpost here for more: https://xilinx.github.io/finn//2020/05/08/finn-v03b-beta-is-released.html
2 replies
Quentin Ducasse
@QDucasse

Hi,
First of all congrats on the v0.3!
I am currently doing the end-to-end example and I was wondering what you would advise as the best way to port this example to a Zedboard-7000 rather than the PYNQ? Will I be able to obtain the corresponding IPs if I simply change fpga_part to the one corresponding to my board? I believe I should then manage the synthesis and deployment on my own?

Thanks in advance!

5 replies
dlphil
@dlphil

Great to see the activity in this community and to learn of the v0.3b release, @maltanar! I'm new to ML on FPGAs and am working through the tfc_end2end_example to get the lay of the land. When calling model.transform, however, I am encountering the following invalid-argument error:
InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: Unknown model file format version.

I have made no changes to the FINN source or notebook. Is this a known issue? Any assistance in getting past this block is very much appreciated!

The full trace follows:

InvalidArgument Traceback (most recent call last)

<ipython-input-5-d0edac25bc01> in <module>
5
6 model = model.transform(InferShapes())
----> 7 model = model.transform(FoldConstants())
8 model = model.transform(GiveUniqueNodeNames())
9 model = model.transform(GiveReadableTensorNames())

/workspace/finn/src/finn/core/modelwrapper.py in transform(self, transformation, make_deepcopy)
100 while model_was_changed:
101 (transformed_model, model_was_changed) = transformation.apply(
--> 102 transformed_model
103 )
104 return transformed_model

/workspace/finn/src/finn/transformation/fold_constants.py in apply(self, model)
52 # this node has no dynamic inputs, only constant ones -- so we can
53 # do constant folding.
---> 54 oxe.execute_node(n, execution_context, graph)
55 # use the execution result as an initializer
56 model.set_initializer(node_out, execution_context[node_out])

/workspace/finn/src/finn/core/onnx_exec.py in execute_node(node, context, graph)
85 input_dict[inp] = context[inp]
86
---> 87 sess = rt.InferenceSession(node_model.SerializeToString())
88 output_list = sess.run(None, input_dict)
89

/opt/conda/lib/python3.6/site-packages/onnxruntime/capi/session.py in __init__(self, path_or_bytes, sess_options, providers)
23 self._path_or_bytes = path_or_bytes
24 self._sess_options = sess_options
---> 25 self._load_model(providers)
26 self._enable_fallback = True
27

/opt/conda/lib/python3.6/site-packages/onnxruntime/capi/session.py in _load_model(self, providers)
41 raise TypeError("Unable to load from type '{0}'".format(type(self._path_or_bytes)))
42
---> 43 self._sess.load_model(providers)
44
45 self._session_options = self._sess.session_options

InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Failed to load model with error: Unknown model file format version.

4 replies
Andrea Bachechi
@B4k3

Hi, is there a list of the layers finn currently supports?

Thanks in advance!

2 replies
Quentin Ducasse
@QDucasse

Hi,
I am trying to complete the end-to-end examples with the CNV and TFC networks. I manage to obtain the stitched IP project, but after applying the MakePYNQProject transformation, no resizer.* file is created. I only have the vivado_pynq_proj_xxx directory with the two shell scripts (make_project.sh and synth_project.sh) and the tcl config script ip_config.tcl.

Am I missing something obvious, or is this an issue? It can be reproduced inside the Docker container in both v0.3 and v0.2.

Thanks in advance!
Quentin

9 replies
Yaman Umuroglu
@maltanar
@/all it's now even easier to contribute to FINN! We now have an automated GitHub Actions workflow to test incoming PRs using the quicktest (FINN tests marked as non-slow, non-Vivado). You can see the status on the PR page and click on Details to get the full console output showing all tests; you can see an example at https://github.com/Xilinx/finn/pull/106/checks?check_run_id=696165825 by expanding the DockerRunQuicktest tab. Any new PR that branches off Xilinx/finn:dev as of its current commit will be checked automatically, and you don't have to do anything else.
Yaman Umuroglu
@maltanar
I'm going to keep posting occasional smaller updates in the channel, hope folks find them useful :)
1 reply
We're starting to put performance regression tests into the full FINN test suite that runs every night. Below you can see two examples on the PYNQ-Z1: one for a small binarized FC network on MNIST with II=16, and one for a pure FIFO (for testing the shell efficiency) with II=2 (twice the width of the DRAM port). The PYNQ invocation overhead (just launching the DMAs and waiting for the result, no data packing etc.) is about 0.4 ms, which isn't too bad. The efficiency is 90%+ if we disregard the invocation overhead.
  tfc_w1a1 Throughput Test Results
  -----------------------------
  From linear regression:
  Invocation overhead: 0.417612 ms
  Time per sample: 0.000702 ms
  Raw data:
  N        runtime[ms]      fclk[mhz]        fps              DRAM rd[Mb/s]    DRAM wr[Mb/s]   
  1        0.4549           100.0            2198.27          0.25             0.09            
  10       0.4644           100.0            21531.33         2.41             0.86            
  100      0.4587           100.0            217999.17        24.42            8.72            
  1000     1.0672           100.0            937065.24        104.95           37.48           
  10000    7.4465           100.0            1342907.82       150.41           53.72           
  -----------------------------
  FIFO Throughput Test Results
  -----------------------------
  From linear regression:
  Invocation overhead: 0.448837 ms
  Time per sample: 0.000019 ms
  Raw data:
  N        runtime[ms]      fclk[mhz]        fps              DRAM rd[Mb/s]    DRAM wr[Mb/s]   
  1        0.4725           100.0            2116.2           0.03             0.03            
  10       0.4683           100.0            21355.93         0.34             0.34            
  100      0.457            100.0            218795.2         3.5              3.5             
  1000     0.4549           100.0            2198272.54       35.17            35.17           
  10000    0.6037           100.0            16565181.67      265.04           265.04          
  100000   2.4002           100.0            41663891.92      666.62           666.62          
  -----------------------------
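The "From linear regression" numbers above fall out of a straight-line fit of runtime against batch size. A minimal sketch reproducing the tfc_w1a1 fit from the raw data in the table:

  import numpy as np

  # runtime[ms] vs batch size N, from the tfc_w1a1 table above
  N = np.array([1, 10, 100, 1000, 10000])
  runtime_ms = np.array([0.4549, 0.4644, 0.4587, 1.0672, 7.4465])

  # least-squares line: runtime = invocation_overhead + N * time_per_sample
  time_per_sample, overhead = np.polyfit(N, runtime_ms, 1)
  print(f"Invocation overhead: {overhead:.6f} ms")     # ~0.42 ms
  print(f"Time per sample: {time_per_sample:.6f} ms")  # ~0.0007 ms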
Hendrik Borras
@HenniOVP
Hi,
recently I have started working on an implementation of column pruning for convolutions in finn-hlslib. You can find my current progress here: https://github.com/HenniOVP/finn-hlslib/tree/feature/col_pruning
A few questions came up: finn-hlslib mentions a Contributor License Agreement, but the link leading to it seems dead, so I was wondering what the Contributor License Agreement says. Also, are there certain rules for contributions and pull requests?
More generally: has pruning already been discussed within FINN, and are there any thoughts or opinions on it? Column pruning is probably the most basic and coarse-grained approach here. Have other granularities been considered?
17 replies
Konrad Lis
@konrad966_gitlab
Hi,
I have registered for Xilinx Open Hardware 2020.
At the current stage of the project I am planning to use the Brevitas/FINN framework to quantize a network and run it on a Zynq SoC.
However, the Xilinx Open Hardware project submission deadline is the 30th of June 2020.
There are a few missing elements that hold me back from running the network through the FINN workflow.
I would like to ask if I can expect them to be working before the contest submission deadline:
  1. Two or more bits for activations after a convolutional layer.
  2. Two bits for weights.
  3. Non-aligned MaxPooling (where the pooling window does not fit into the input tensor an integral number of times, e.g. an input of shape 11x11 with 2x2 MaxPooling).
  4. Convolutions with a stride that does not fit an integral number of times into the kernel shape (e.g. a 3x3 kernel with stride 2 seems to break at some stage of the FINN workflow).
  5. Convolutions with padding.
  6. Transposed convolution layers.
9 replies
XUEYULIANG
@XUEYULIANG
Hi,
I'm not familiar with Linux, and I am having trouble opening the Jupyter notebook in Docker, as can be seen in the attached picture.
Can you help me?
Thank you.
2 replies
1.png
Yaman Umuroglu
@maltanar
unnamed.png
When looking for the performance bottleneck in a streaming system, one useful tool is to look at the ready/valid handshakes on streaming interfaces over time and search for particular patterns (e.g. valid=1 ready=0 implies that either that node or something downstream from it is holding us up). If you have a large system with lots of streaming interfaces (say, a DNN with many layers in FINN), that's a lot of interfaces to inspect manually. We now have utility functions in FINN to help with this task. Above you can see the top-3 streaming interfaces (extracted by name from a VCD rtlsim trace) with the highest percentage of time in the valid=1 ready=0 state, plus all the other streaming states for those interfaces. We can see that the interface StreamingFIFO_0_out has (V)alid = 1 (R)eady = 0 for 120 cycles, which is 0.014 (1.4%) of the time in this example.
3 replies
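The underlying idea is simple enough to sketch: given per-cycle samples of one stream's valid and ready signals (e.g. pulled from a VCD trace), tally the handshake states and report each as a fraction of total cycles. A minimal sketch, not FINN's actual utility code:

  from collections import Counter

  def stream_state_stats(valid, ready):
      # valid, ready: equal-length 0/1 samples of one stream, one entry per clock cycle
      states = Counter(zip(valid, ready))
      total = len(valid)
      return {(v, r): n / total for (v, r), n in states.items()}

  # a high fraction for state (1, 0) -- valid=1, ready=0 -- flags a downstream stall
  print(stream_state_stats([1, 1, 1, 0], [1, 0, 1, 1]))
  # {(1, 1): 0.5, (1, 0): 0.25, (0, 1): 0.25}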
Justin Hai
@JKHHai
Hi everyone,
I'm new to using FINN, and I've gone through the tutorials and come back with some conceptual questions:
  1. In the end-to-end tutorial, we insert a TLastMarker node into our child model before performing synthesis. What does this TLastMarker node do, and why is it required for DMA access?
  2. In the tutorial it's mentioned that the ReplaceVerilogRelPath transform is required to prevent errors further on. What sorts of errors might be observed if this transform is not applied?
  3. I'm reading the FINN paper published at https://arxiv.org/pdf/1612.07119.pdf, and I'd like to learn a bit more about the SIMD lanes in the MVTU. If I set my PE parameter to 10 and my SIMD parameter for an FC layer to 20, does this mean that each of the 10 PEs in the MVTU will have 20 SIMD lanes? And does that mean that each PE can perform 20 computations in parallel, for 200 operations in total?
    Thank you,
    Justin
6 replies
dlphil
@dlphil

@maltanar thank you for all your rapid assistance with questions in this channel. I'm encountering the same error as in this issue raised on GitHub, https://github.com/Xilinx/finn/issues/46#issuecomment-592702253, from back in March. When I run "sudo python3 driver.py", as per your suggestion, I encounter the following error:

Traceback (most recent call last):
File "driver.py", line 8, in <module>
from finn.util.data_packing import (
File "/home/xilinx/finn/pynq_deployment_vunnrz1d/finn/util/data_packing.py", line 34, in <module>
from bitstring import BitArray
ModuleNotFoundError: No module named 'bitstring'

This occurs during the execute_onnx second-to-last step of the end-to-end cnv notebook. The exact error is as follows, and, indeed, I am unable to find an output.npy anywhere in my environment:

FileNotFoundError: [Errno 2] No such file or directory: '/tmp/finn_dev_xilinx/pynq_deployment_vs019vmj/output.npy'

Would you know which stage generates output.npy? I am picking up the deployment stage after having completed the prior stages a couple of days ago. Would I need to redo the network preparation and/or synthesis steps, or should I be able to pick up at "5. Deployment and Remote Execution" with the saved end2end_cnv_w1a1_synth.onnx model? Many thanks!

7 replies
Justin Hai
@JKHHai
Hi everyone, I have a question about simulating ONNX models using CPPSim: can we simulate ONNX models with HLS layers using CPPSim at any point in the end-to-end flow, or do we have to insert FIFOs and DWCs, or set up PE and SIMD folding factors, first?
Thanks,
Justin
6 replies
dlphil
@dlphil
For the deployment-to-board step, executing the non-synthesizable nodes from the parent graph seems to require files that are stored in /tmp. As I am running into performance restrictions on my local machine, I wish to break up the flow: perform synthesis using offsite cloud resources, and use my local machine for deployment to the PYNQ-Z1. With the .tcl, .hwh and .bit files, I believe I can program the overlay onto the board, but I'm struggling with how to modify the flow such that I preserve the non-synthesizable components. Any guidance on a complete list of necessary files and procedures that need to be packaged in order to deploy a design outside of a single-pass, end-to-end flow would be very much appreciated. Thank you!
17 replies
Hendrik Borras
@HenniOVP
Hi, just a quick general question: is the name FINN an acronym of some sort, or is it just a name?
1 reply
Quentin Ducasse
@QDucasse

Hello,
I am currently trying to reproduce the end-to-end flow with a MobileNet. I followed the first steps as given in test_brevitas_mobilenet.py. The flow works perfectly up to the HLS conversion part, which I tried to add myself. Once the model is streamlined, I tried to run the following transformations:

model = model.transform(to_hls.InferQuantizedStreamingFCLayer("decoupled"))
model = model.transform(to_hls.InferConvInpGen())

However, I get hit by:

  File "/workspace/finn/src/finn/transformation/fpgadataflow/convert_to_hls_layers.py", line 48, in apply
    if n.op_type == "Im2Col" and not getCustomOp(n).get_nodeattr("dw"):
  File "/workspace/finn/src/finn/custom_op/__init__.py", line 62, in get_nodeattr
    raise AttributeError("Op has no such attribute: " + name)
AttributeError: Op has no such attribute: dw

I believe this check is there to prevent the transformation from touching the depthwise/pointwise convolution pairs, but I am now stuck in the flow. I may be missing a transformation that annotates those particular convolutions.

On another note, do you have any recommended practices for setting the different folding parameters, in terms of PE, SIMD and FIFO depths?

Thanks for all the help I got so far!

16 replies
dlphil
@dlphil

Regarding the CNV-W2A2 variant in the development branch: I am following the flow in test_end2end_cnv_w2a2.py and integrated the changes into my model's notebook. At the IP stitching step, though, I keep encountering a missing-TLastMarker assertion error, even though I am running the InsertTLastMarker transform as per the test example.

AssertionError: Last node is not TLastMarker.
Please run transformation InsertTLastMarker to ensure a valid
TLast signal

Looking through the insert_tlastmarker.py code, I am unable to determine why TLastMarker is not being applied. The main difference between the test example and my model is that I am training the 2-bit CNV on a custom 3-class dataset. To accommodate the PE division, I changed the last folding factor to
(3, 1, 6, "distributed")
but the remaining transform code is the same as in the example.
Is this potentially a known issue? Any insight on how to proceed with debugging would be greatly appreciated. Thanks!

4 replies
Yaman Umuroglu
@maltanar
Heads-up for any FINN devs who want to take advantage of parallel unit/integration testing: since we now have over 2000 tests, it was about time to put some parallel testing in place. This is already applied for the GitHub Actions (non-Vivado, non-slow tests), so you don't need to do anything there, although the speedup is minimal since we only get 2 cores there. For running tests in parallel on your local machine, you can use the following:
  • for rtlsim tests, use pytest-parallel, e.g. python setup.py test --addopts "-k rtlsim --workers auto"
  • for all other tests, use pytest-xdist; make sure to add --dist=loadfile if you have tests in the same file that depend on each other, e.g. python setup.py test --addopts "-k mytest -n auto --dist=loadfile"
Yaman Umuroglu
@maltanar
I just uploaded a small slide deck that explains some of the nitty-gritty details of how we do folding/time-multiplexing in FINN, e.g. how PE and SIMD relate to matrix sizes. You can find it here: https://github.com/Xilinx/finn/blob/dev/docs/finn-sheduling-and-folding.pptx
3 replies
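The arithmetic behind PE/SIMD folding (also asked about earlier in the channel) boils down to how many cycles one matrix-vector product takes. A minimal sketch of that relationship; the concrete sizes below are hypothetical:

  # weight matrix of an FC layer: MH rows (neurons) x MW columns (synapses per neuron)
  MH, MW = 64, 32
  PE, SIMD = 16, 8     # parallelism across rows / across columns

  # FINN requires the parallelism to divide the matrix dimensions evenly
  assert MH % PE == 0 and MW % SIMD == 0

  total_fold = (MH // PE) * (MW // SIMD)   # cycles per matrix-vector product
  ops_per_cycle = PE * SIMD                # MAC operations performed each cycle
  print(total_fold, ops_per_cycle)         # 16 128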
Yaman Umuroglu
@maltanar
I recently gave a webinar talk on FINN and the upcoming member of the FINN family called LogicNets, if you want to check it out: https://vimeo.com/showcase/7255358/video/436376922
Yaman Umuroglu
@maltanar
Screenshot from 2020-07-11 01-41-56.png
1 reply
Coming soon: export graphs from Brevitas with special DebugMarker nodes (like above) and PyTorch forward hooks, to compare intermediate activations between the Brevitas version and the FINN-ONNX exported version. This is handy for debugging, especially for larger networks that don't export correctly.
Quentin Ducasse
@QDucasse

Hello,
I am currently running the Brevitas/FINN workflow with the TFC network, but I am using weights and activations ranging from 2 to 32 bits.
I am stuck in the process because the bitfile cannot be generated. When opening the resizer.xpr project in Vivado, the following implementation error is displayed:

[DRC UTLZ-1] Resource utilization: CARRY4 over-utilized in Top Level Design (This design requires more CARRY4 cells than are available in the target device. This design requires 14050 of such cell types but only 13300 compatible sites are available in the target device. Please analyze your synthesis results and constraints to ensure the design is mapped to Xilinx primitives as expected. If so, please consider targeting a larger device.)

I guess I am using too many resources for my application. What do you suggest would be the best way to make it work? Correcting and reducing the folding? Or is there an existing optimisation that I can use?

17 replies
Julio Cesar de Azeredo
@jcazeredo

Hello, I'm using Brevitas and FINN but I'm having an issue I couldn't find an answer to. I need to export my model to ONNX and am trying to use export_finn_onnx, but I can't find where it is located. When I try to import brevitas.onnx as bo I get the error ModuleNotFoundError: No module named 'brevitas.onnx'. I tried searching all files in the repository but didn't find this function. I need it because I want to export my model to ONNX and use it with FINN.

Does anyone know what I need to do to use this function?

10 replies
juansuzano
@juansuzano

Does FINN already support the implementation of non-binarized neural networks, for example TFC_w2_a2?

I'm trying to implement this network using the tfc_w1_aq_end2end_example notebook as a basis, but I'm getting some errors.

Thanks in advance

9 replies
syedsauda
@syedsauda

I've tried the end2end examples on both the dev and master branches, but I keep getting this File Not Found error, https://pastebin.com/u6cNWDQ2, when reaching the HLSSynthIP() command. Vivado is being recognised and HLS logs are generated in the tmp directory.

Is my installation the problem, or is it a problem with the versions of the installed libraries?

11 replies