Open access peer-reviewed chapter

Reconfigurable Platform Pre-Processing MAC Unit Design: For Image Processing Core Architecture in Restoration Applications

Written By

G.N. Chiranjeevi and Subhash Kulkarni

Submitted: 03 May 2022 Reviewed: 16 September 2022 Published: 08 February 2023

DOI: 10.5772/intechopen.108139

From the Edited Volume

Latest Advances and New Visions of Ontology in Information Science

Edited by Morteza SaberiKamarposhti and Mahdi Sahlabadi


Abstract

Most image processing algorithms operate on two-dimensional (2D) data, so the 2D convolution function has important implications for image processing needs. The 2D convolution and MAC design processes are used to perform image analysis tasks such as image blurring, smoothing, feature extraction, and image classification. This study's primary goal is to develop a more efficient MAC control block-based architecture for 2D convolution. When deployed in image processing applications, the proposed 2D convolution architecture is significantly faster and requires far fewer hardware resources. The resulting convolution values are stored in memory once the convolution procedure completes.

Keywords

  • pre-processing block
  • multi-byte fetching
  • boundary padding
  • block memory access
  • kernel architecture
  • 2D convolution
  • MAC
  • image processing
  • FPGA

1. Introduction

Two-dimensional (2D) convolution is used across a wide range of image and video processing applications, particularly image smoothing, image sharpening, edge detection, and several image restoration techniques. All of these platforms require a 2D convolution block in digital image processing, and the MAC design procedure is one of the most important operations in image processing applications. Implementing a 2D convolution framework involves several steps, and each pixel value contributes to the result. The first step is to convert the input into pixel values and store them in a matrix; the user-selected MAC operations are then executed on these matrices.
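The pixel-matrix and MAC steps above can be sketched in software. The following Python snippet is an illustrative model, not the hardware design: it multiply-accumulates each K × K neighborhood of the image with the kernel. The kernel is applied without flipping, which coincides with true convolution for symmetric kernels such as the averaging kernel used here.

```python
def convolve2d(image, kernel):
    """MAC-based 2D convolution (illustrative software model).

    image  : list of lists of pixel values
    kernel : K x K list of lists of weights
    Border pixels without a full neighborhood are skipped, so the
    output shrinks by K - 1 in each dimension.
    """
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        out_row = []
        for c in range(cols - k + 1):
            acc = 0  # MAC accumulator: sum of pixel * weight products
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out

# 3x3 averaging (smoothing) kernel applied to a small 4x4 image
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(convolve2d(img, box))  # -> [[54, 63], [90, 99]]
```

Each output value is the sum of nine multiply-accumulate operations, which is exactly the work the MAC control block performs per pixel for a 3 × 3 kernel.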

Image processing algorithms and architectures [1, 2] perform best when given specific types of data as inputs. In the vast majority of cases, however, the input image fails to meet critical requirements, so preprocessing occurs prior to application-specific processing [3]. The storage of images is a significant issue in image processing; over the years, many image file formats have been developed to represent images in a streamlined, high-quality manner usable across a variety of platforms [4]. Depending on the preprocessing applied, different images of the same type can have different signal-intensity scales.


2. Image processing blocks in preprocessing blocks

Preprocessing systems [5, 6, 7, 8, 9, 10] are used to prepare data for analysis, making the next block of data available for optimal use. The main purpose of preprocessing is to improve the quality of the image so that it can be analyzed correctly [11, 12]. Preprocessing can remove unnecessary distortions from a data set and can enhance the features that matter for the target application. Which features improve an image depends on the application (e.g., general image processing, image enhancement, or image analysis) [13, 14, 15, 16, 17, 18, 19], so preprocessing can make the data more effective for other types of algorithms.

One of the most distinguishing features of architectures that process images using Vedic computing is the immense amount of data involved. Displaying a grayscale image at a reasonable resolution requires roughly 2 × 10⁶ bits (e.g., 512 × 512 pixels at 8 bits per pixel). Digital images can be stored and transmitted using XSG blocks, but they need to be compressed and edge-detected first; this is done with image compression and edge detection hardware.

In FPGA preprocessing subsystems (DSP modules) [20], data are transformed from standard software-compatible representations [21] into more hardware-compatible representations that exploit data parallelism [22, 23] in certain hardware architectures, such as signal and video processing frameworks [24], which often deviate from conventional von Neumann models in favor of dataflow models [25].


3. Block hardware architecture for preprocessing

This architecture is designed to simplify image processing by preparing image data, as shown in Figure 1. The strategy proposed in this research is organized into five phases: (1) data collection and analysis, (2) model development, (3) testing and validation, (4) implementation, and (5) evaluation. In the data collection and analysis phase, data are collected from a sample of participants to create a database of information. In the model development phase, a model is developed to explain the relationship between variables. In the testing and validation phase, the model is tested against data from a different sample of participants to check its accuracy. The implementation phase deploys the model on a large scale, and the evaluation phase assesses its effectiveness. This chapter introduces a preprocessing method for images, as well as an algorithm for automatically screening images using that method.

Figure 1.

Architecture of the work.


4. Different modes of operations for block memory preprocessing

Two constraints must be considered when designing the preprocessing block memory from the user's perspective. The first is that one pixel value of one byte is written into the target device's memory on each clock cycle. The second is the representation of the input image (e.g., 512 × 512). The main goal of this block memory is the ability to change the kernel size during read and write operations.

The ability to choose the kernel size during read operations is an advantage over the built-in IP core model. During read operations, the IP core can access data only in power-of-two widths (i.e., 2, 4, 8, 16, 32, or 64 bits), whereas the proposed design accesses data according to the kernel requirements and is not restricted to specific values. In the proposed design, the read width is the 8-bit pixel value multiplied by the kernel size.

4.1 Write mode

During each clock cycle, one pixel of data is written into memory at the address indicated by the address pointer.

4.2 Read mode

To circumvent the FIFO paradigm, the proposed hardware architecture activates read operations "N" times, depending on the needs of the user. The number of read iterations is determined by the kernel size: if the kernel is 3 by 3, for example, the values from three adjacent positions are read. This ensures that sufficient data are available for the IP core architecture. The user must specify the memory location from which to read. The read output is then extracted from the pre-processing block hardware as a concatenation of three pixel data values selected by the read pointer. Finally, data are accessed from all three locations simultaneously.
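The write and read modes described above can be modeled behaviorally. The sketch below is a hypothetical Python model under our own naming, not the authors' RTL: it writes one 8-bit pixel per clock cycle and, on a read, concatenates `kernel_size` adjacent pixels into a single output word, so a 5 × 5 kernel yields a 40-bit value.

```python
class PreprocBlockMemory:
    """Behavioral sketch of the pre-processing block memory.

    One 8-bit pixel is written per clock cycle; a read returns
    `kernel_size` adjacent pixels concatenated into one word.
    Class and method names are illustrative assumptions.
    """

    def __init__(self, depth):
        self.mem = [0] * depth  # e.g., depth = 512 * 512 for one image
        self.wr_ptr = 0         # address pointer for write mode

    def write(self, pixel):
        """Write mode: store one 8-bit pixel at the write pointer."""
        self.mem[self.wr_ptr] = pixel & 0xFF
        self.wr_ptr += 1

    def read(self, rd_ptr, kernel_size):
        """Read mode: fetch `kernel_size` adjacent pixels and
        concatenate them (8 bits each) into a single output word."""
        word = 0
        for offset in range(kernel_size):
            word = (word << 8) | self.mem[rd_ptr + offset]
        return word

mem = PreprocBlockMemory(512 * 512)
for px in [0x11, 0x22, 0x33, 0x44, 0x55]:
    mem.write(px)                 # one pixel per simulated clock
print(hex(mem.read(0, 3)))        # 3 adjacent pixels -> 24-bit word, 0x112233
print(hex(mem.read(0, 5)))        # 5 adjacent pixels -> 40-bit word, 0x1122334455
```

This mirrors the key difference from the built-in IP core: the read width is 8 bits times the kernel size, not a fixed power of two.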

4.3 Additional user choice

Image padding adds new pixels around the edges of an image. When advanced filtering methods are used, the border serves as a boundary or provides space for annotations. Three boundary-handling modes, described below, are offered as user-selectable options.

4.4 Duplicate mode

Two additional rows and two additional columns are introduced. The pixel values of the first and last rows are copied into new rows placed before the first row and after the last row, as shown in Figure 2. The values of the first and last columns of the original image are likewise copied into the new columns. After duplication, the cascading of two extra rows and two extra columns turns a 512 × 512 image into a 514 × 514 image.

Figure 2.

Duplicate mode: original image versus with padding extra row and column.

4.5 Zero padding

Figure 3 shows the addition of extra rows with all pixel values set to zero; columns with a pixel value of 0 are introduced in the same way.

Figure 3.

Zero padding modes: original image versus with padding extra row and column filled with zero.

4.6 Non-duplicate mode

In this mode, the IP core operates on the existing image and ignores the boundaries. The edge pixels are skipped, and computation is performed only for pixels whose neighbors are all available, as highlighted in Figure 4.

Figure 4.

Non-duplicate mode: remains same as original image (no additional row and column).
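The three boundary options above (duplicate, zero padding, and non-duplicate) can be summarized in a small software model. The function below is an illustrative Python sketch under our own naming, not the hardware implementation.

```python
def pad_image(img, mode):
    """Apply one of the three user-selectable boundary options.

    'duplicate' replicates the border rows/columns (Figure 2 behavior),
    'zero'      adds a border of zero-valued pixels (Figure 3 behavior),
    'none'      leaves the image unchanged; border pixels are simply
                skipped later by the IP core (Figure 4 behavior).
    """
    if mode == "none":
        return [row[:] for row in img]
    width = len(img[0])
    fill_top = img[0] if mode == "duplicate" else [0] * width
    fill_bot = img[-1] if mode == "duplicate" else [0] * width
    padded = [fill_top] + [row[:] for row in img] + [fill_bot]
    out = []
    for row in padded:
        left = row[0] if mode == "duplicate" else 0
        right = row[-1] if mode == "duplicate" else 0
        out.append([left] + list(row) + [right])
    return out

img = [[1, 2], [3, 4]]
print(pad_image(img, "zero"))       # 4x4 result with a zero border
print(pad_image(img, "duplicate"))  # 4x4 result with replicated edges
```

In every padded mode an N × N image becomes (N + 2) × (N + 2), matching the 512 × 512 to 514 × 514 growth described above.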


5. Experimental analysis of reconfigurable platform

Hardware design strategies such as parallelism and pipelining are feasible on an FPGA but not in dedicated DSP processors. Using reconfigurable hardware to implement image processing algorithms lowers time-to-market costs, permits rapid prototyping of complex algorithms, and makes debugging and verification easier. FPGAs are therefore a good choice for real-time image processing algorithms.

The preprocessing hardware design in this work is implemented in Verilog. Functional verification is performed on reconfigurable hardware using the design flow shown in Figure 5.

Figure 5.

Design flow for preprocessing architecture of the work.

Figure 6 shows the external view for a 5 by 5 kernel. In this case, the output is 40 bits wide, that is, five 8-bit pixel values delivered in one clock cycle. The design has complete control over all external input and output pins.

Figure 6.

External view for preprocessing architecture of the work.

In digital circuit design, the register-transfer level (RTL) is a design abstraction that models a synchronous digital circuit in terms of the flow of digital signals (data) between hardware registers and the logical operations performed on those signals; the RTL design of this work is shown in Figure 7.

Figure 7.

RTL design for preprocessing architecture of the work.

Figure 8 illustrates that FPGAs can significantly speed up image processing applications compared to software implementations. By employing our method, the number of operations in a real-time application can be reduced, allowing more complex algorithms to be included.

Figure 8.

Utilization report for preprocessing architecture of the work.


6. Results and discussion

2D MAC convolution is a general-purpose image filtering operation that determines the value of each center pixel from the weighted values of the surrounding pixels. Convolving a 512 × 512 image matrix with a kernel of size (3 × 3), (5 × 5), ..., (K × K) yields a new, filtered image. At each location, the overlapping pixel and kernel values are multiplied and their products summed; the result is the output pixel value at that particular location.

Input pixels are written into memory and accessed using a read operation; Figure 9 shows the results for a 3 by 3 kernel. In other words, the sequential read operation is enabled three times in a row.

Figure 9.

Simulation results for preprocessing architecture of the work.


7. Conclusions

In this study, the preprocessing technique facilitates IP core access to data and improves image quality through different image processing techniques. Analysis and verification of the results are carried out on a standard reconfigurable platform (Zynq board), together with an evaluation of hardware usage (area). The purpose of this study was to select the ideal memory operations, including multiple reads and writes, while taking into account the main user input, the kernel size. Preprocessing not only reduces memory access times but also improves performance. The data collected through this procedure may be useful for advancing the research, and the resulting reconfigurable platform can be deployed in a variety of ways.


Acknowledgments

This research is being carried out at the Research Center of Electronics and Communications at PESIT Bangalore South Campus (PES University), Bangalore, Karnataka, India, which was established in 1988.


Notes/thanks/other declarations

I would like to thank my guide, Dr. Subhash Kulkarni, for his continuous support and encouragement at each and every stage of this research.

References

  1. Smith TF, Waterman MS. Identification of common molecular subsequences. Journal of Molecular Biology. 1981;147:195-197
  2. Nikhil R. Bluespec SystemVerilog: Efficient, correct RTL from high-level specifications. In: Proceedings of the Second ACM and IEEE International Conference on Formal Methods and Models for Co-Design (MEMOCODE'04). San Diego, CA, USA; 23-25 June 2004. pp. 69-70
  3. Shoup R. Parameterized Convolution Filtering in a Field Programmable Gate Array. Technical Report, Interval Research. Palo Alto, California; 1993
  4. Neoh HS, Hazanchuk A. Adaptive edge detection for real-time video processing using FPGAs. In: 2004 GSPx Conference; 2004
  5. Wang J, Zhong S, Yan L, Cao Z. An embedded system-on-chip architecture for real-time visual detection and matching. IEEE Transactions on Circuits and Systems for Video Technology. 2014;24:525-538
  6. Mondal P, Biswal PK, Banerjee S. FPGA-based accelerated 3D affine transform for real-time image processing applications. Computers and Electrical Engineering. 2016
  7. Kadric E, Lakata D, DeHon A. Impact of parallelism and memory architecture on FPGA communication energy. ACM Transactions on Reconfigurable Technology and Systems. 2016;9:1-23
  8. Pezzarossa L, Kristensen AT, Schoeberl M, Sparsø J. Using dynamic partial reconfiguration of FPGAs in real-time systems. Microprocessors and Microsystems. 2018;61:198-206
  9. Brown SD, Francis RJ, Rose J, Vranesic ZG. Field-Programmable Gate Arrays. Kluwer Academic Publishers; 1992
  10. Hirai S, Zakouji M, Tsuboi T. Implementing image processing algorithms on FPGA-based real-time vision systems. In: Proceedings of the 11th Synthesis and System Integration of Mixed Information Technologies (SASIMI 2003). Hiroshima; April 2003. pp. 378-385
  11. Torres-Huitzil C, Nuño-Maganda MA. Area-time efficient implementation of local adaptive image thresholding in reconfigurable hardware. ACM SIGARCH Computer Architecture News. 2014;42:33-38
  12. Sungheetha A, Sharma R. A novel CapsNet based image reconstruction and regression analysis. Journal of Innovative Image Processing (JIIP). 2020;2(03):156-164
  13. Navabi Z. Digital Design and Implementation with Field Programmable Devices. Kluwer Academic Publishers; 2011
  14. Lysaght P, Blodget B, Mason J, Young J, Bridgford B. Invited paper: Enhanced architectures, design methodologies and CAD tools for dynamic reconfiguration of Xilinx FPGAs. In: Proceedings of the International Conference on Field Programmable Logic and Applications (FPL '06). pp. 1-6
  15. Chiranjeevi GN, Kulkarni S. Pipeline architecture for N=K*2L bit modular ALU: Case study between current generation computing and Vedic computing. In: 6th International Conference for Convergence in Technology (I2CT). 2021. pp. 1-4. DOI: 10.1109/I2CT51068.2021.9417917
  16. Durgakeri BS, Chiranjeevi GN. Implementing image processing algorithms using Xilinx System Generator with real time constraints. In: 4th International Conference on Recent Trends on Electronics, Information, Communication & Technology (RTEICT). 2019. pp. 230-234. DOI: 10.1109/RTEICT46194.2019.9016962
  17. Ravi S, Abdul Rahim B, Shaik F. FPGA based design and implementation of image edge detection using Xilinx System Generator. International Journal of Engineering Trends and Technology (IJETT). Oct 2013;4(10):4657-4660
  18. Raut P, Gokhale AV. FPGA implementation for image processing algorithms using Xilinx System Generator. IOSR Journal of VLSI and Signal Processing. 2(4):26-36
  19. Chaitanya Deepthi V, Surya Prasad P. Medical image fusion using Xilinx System Generator in FPGA. International Journal of VLSI System Design and Communication Systems. Oct 2016;4(10):0990-0993
  20. Chiranjeevi GN, Kulkarni S. Validation of the FPGA-based image processing techniques using the efficient tool like Xilinx device generators. International Journal of Emerging Trends in Engineering Research. Apr 2021;9(4)
  21. Li C, Rui H, Zhaohua D, Chris GJ, Dimitris NM, John CG. A level set method for image segmentation in the presence of intensity inhomogeneities with application to MRI. IEEE Transactions on Image Processing. 2011;20:2007-2016
  22. Ankita B, Kalyani M. Fuzzy-based artificial bee colony optimization for gray image segmentation. Signal, Image and Video Processing. 2016;10:1089-1096
  23. Qingyi L, Mingyan J, Peirui B, Guang Y. A novel level set model with automated initialization and controlling parameters for medical image segmentation. Computerized Medical Imaging and Graphics. 2016;48:21-29
  24. Farid MS, Lucenteforte M, Grangetto M. DOST: A distributed object segmentation tool. Multimedia Tools and Applications. 2018;77:20839-20862
  25. Qureshi MN, Ahamad MV. An improved method for image segmentation using K-means clustering with neutrosophic logic. Procedia Computer Science. 2018;132:534-540
