Intelligent DC Series Arc Fault Detection using Deep Learning
in Photovoltaic Systems
Author:
Lu, Shibo
Publication Date:
2021
DOI:
https://doi.org/10.26190/unsworks/2236
License:
https://creativecommons.org/licenses/by-nc-nd/3.0/au/
Link to license to see what you are allowed to do with this resource.
Downloaded from http://hdl.handle.net/1959.4/70737 in https://
unsworks.unsw.edu.au on 2024-02-08
INTELLIGENT DC SERIES ARC FAULT 
DETECTION USING DEEP LEARNING IN 
PHOTOVOLTAIC SYSTEMS 
Shibo LU 
Supervisor: A/Prof. Toan PHUNG 
Secondary supervisor: Dr. Daming ZHANG 
 
A thesis in fulfilment of the requirements for the degree of 
Doctor of Philosophy 
 
School of Electrical Engineering & Telecommunications 
Faculty of Engineering 
University of New South Wales 
November 2020
 
 
Surname/Family Name : LU 
Given Name/s : Shibo 
Abbreviation for degree as given in the University calendar : Ph. D. 
Faculty : Engineering 
School : School of Electrical Engineering and Telecommunications 
Thesis Title : Intelligent DC Series Arc Fault Detection Using Deep Learning in Photovoltaic Systems 
 
Abstract 350 words maximum: 
 Grid integration of renewable sources including solar energy is growing faster than ever before. Nowadays, solar 
power development is increasing throughout the world, and solar photovoltaic (PV) systems play an important role to 
support the main loads and micro-grids. However, one needs to consider the long-term performance of PV components. 
Their deterioration can be caused by various factors such as ageing, weathering, the higher DC operating voltage level, 
improper installation, inadequate maintenance, etc. The consequence is a growing potential of electrical arcing incidents 
especially the series arc fault in PV systems. Without timely detection and interruption, such dangerous events can cause 
catastrophic fires, posing a severe threat to human safety and properties. 
In this thesis, a comprehensive review of DC arc faults and their diagnosis methods in PV systems is presented. 
Experimental study of DC series arc fault characteristics is carried out. The feasibility of applying deep learning (DL) in 
series arc fault detection in PV systems is systematically investigated. Specifically, convolutional neural networks (CNN) 
are successfully applied and demonstrate superior diagnosis performance over conventional machine learning algorithms 
and other popular DL algorithms. For cost-effective real-time deployment, a lightweight CNN structure is designed to 
achieve a good balance between model complexity and detection accuracy. Moreover, novel frameworks, including 
domain adaptation and deep convolutional generative adversarial network (DA-DCGAN) and lightweight transfer 
convolutional neural network with adversarial data augmentation (LTCNN-ADA), are proposed. They aim to address the 
challenges when applying DL to practical applications, including lack of fault data from the field, data inconsistency 
between laboratory and field, and limited computation resources in edge devices. The proposed methods are validated 
through comprehensive offline analysis using pre-recorded data. In addition, the trained DL classification models are 
deployed in an embedded system and tested in single-phase and three-phase PV systems in real-time under different test 
conditions. Both offline and online experimental results show that the proposed methods can accurately and reliably 
detect series arc faults in PV systems. 
 
Declaration relating to disposition of project thesis/dissertation 
 
I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole 
or in part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain 
all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or 
dissertation. 
 
I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to 
doctoral theses only). 
 
…………………………………………………………… 
 Signature 
 
……………………………………..……………… 
 Witness Signature 
 
……….……………………...…….… 
 Date 
The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests 
for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in 
exceptional circumstances and require the approval of the Dean of Graduate Research. 
 
FOR OFFICE USE ONLY Date of completion of requirements for Award: 
 
 
 
 
 
Originality Statement 
I hereby declare that this submission is my own work and to the best of my knowledge it 
contains no materials previously published or written by another person, or substantial 
proportions of material which have been accepted for the award of any other degree or diploma 
at UNSW or any other educational institution, except where due acknowledgement is made in 
the thesis. Any contribution made to the research by others, with whom I have worked at UNSW 
or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content 
of this thesis is the product of my own work, except to the extent that assistance from others in 
the project's design and conception or in style, presentation and linguistic expression is 
acknowledged. 
 
Signed………………………………. 
Date…………………… 
 
 
 
 
 
 
COPYRIGHT STATEMENT 
 I hereby grant the University of New South Wales or its agents the right to archive and to make 
available my thesis or dissertation in whole or part in the University libraries in all forms of media, 
now or here after known, subject to the provisions of the Copyright Act 1968. I retain all 
proprietary rights, such as patent rights. I also retain the right to use in future works (such as 
articles or books) all or part of this thesis or dissertation. 
I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation 
Abstract International (this is applicable to doctoral theses only). I have either used no 
substantial portions of copyright material in my thesis or I have obtained permission to use 
copyright material; where permission has not been granted, I have applied/will apply for a partial 
restriction of the digital copy of my thesis or dissertation. 
 
Signed .............................................................................. 
 
Date .............................................................. 
 
AUTHENTICITY STATEMENT 
I certify that the Library deposit digital copy is a direct equivalent of the final officially approved 
version of my thesis. No emendation of content has occurred and if there are any minor 
variations in formatting, they are the result of the conversion to digital format. 
 
Signed .............................................................................. 
 
Date................................................................ 
 
 
 
 
 
 
 
 
 
 
 
 
A THESIS DEDICATED TO MY PARENTS. 
 
 
Acknowledgement 
It has been a long and hard journey towards a Ph. D., and I have finally reached the 
end. 
I would like to express my deepest gratitude to my primary supervisor, A/Prof. 
Toan Phung. Without your expert guidance, insightful suggestions, and encouragement 
during the past few years, this thesis could not have been completed. 
Your inspiring mind and firm support have undoubtedly taken my research to another 
level.

Arc Model 
Cassie and Mayr arc models are the most popular and widely used, developed 
based on the principle of energy conversion. The Cassie arc model can be expressed as 
below: 
$$\frac{1}{g}\frac{dg}{dt} = \frac{1}{\tau}\left(\frac{u_{arc}\, i_{arc}}{V_c^{2}} - 1\right) \tag{2.2}$$
$$g = \frac{i_{arc}}{u_{arc}} \tag{2.3}$$
where $g$ denotes the arc conductance, $u_{arc}$ is the arc voltage, $i_{arc}$ is the arc current, $V_c$ 
is the constant arc voltage, and $\tau$ is the arc time constant. In this model, Cassie assumed that the 
power loss is caused by forced convection, which means that the arc cross-sectional area is 
proportional to the arc current; the Cassie arc model is therefore better suited to simulating high 
arc currents. In contrast, the Mayr arc model is more suitable for low-current 
arcs, because Mayr assumed that the power loss is caused by thermal conduction and 
remains constant [45]. The equation is shown below: 
$$\frac{1}{g}\frac{dg}{dt} = \frac{1}{\tau}\left(\frac{i_{arc}^{2}}{P} - 1\right) \tag{2.4}$$
where $g$ denotes the arc conductance, $i_{arc}$ is the arc current, $P$ is the static cooling 
power, and $\tau$ is the arc time constant, determined empirically. The unknown constant $P$ 
in (2.4) can be determined from experimental results [46]. Many other physics-based arc 
models could also be used for PV applications, such as Lowke's model and the modified 
Mayr model [42], [47]. However, those models involve more parameters, which makes them 
harder to implement in simulation. 
 
 
2.3.2. V-I Characteristic-based Arc Model 
Since 1902, many empirical models, such as the Hertha Ayrton model, the 
Steinmetz model, the Van and Warrington model, and the Paukert model, have been proposed 
[43]. Some of the more popular models are described below. 
2.3.2.1. Nottingham Arc Model 
Nottingham developed an arc equation as shown in (2.5): 
$$V_{arc} = A + \frac{B}{I_{arc}^{n}} \tag{2.5}$$
where the values of $A$, $B$, and $n$ depend on the arc length and the electrode material. 
The equation covers arc lengths from 1 mm to 10 mm (0.0394 in. to 0.394 in.) and current 
levels up to 10 A. When the electrode material is copper, which is the most common 
material used in PV systems, the parameters are $A = 27.5$, $B = 44$, and 
$n = 0.67$ [48]. Because of the limited range of current and arc length, this equation is 
only suitable for simulating series arc faults below string level. 
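As a minimal sketch of how (2.5) could be evaluated in a simulation, the following Python function uses the copper-electrode parameters quoted above as defaults. The function name and the range check on the current are illustrative assumptions, not part of the original model description.

```python
def nottingham_arc_voltage(i_arc, a=27.5, b=44.0, n=0.67):
    """Arc voltage in volts from Eq. (2.5): V_arc = A + B / I_arc**n.

    Defaults are the copper-electrode parameters quoted in the text.
    The 10 A upper limit is the stated validity range of the model.
    """
    if not 0 < i_arc <= 10:
        raise ValueError("Eq. (2.5) is stated for arc currents up to 10 A")
    return a + b / i_arc ** n


if __name__ == "__main__":
    for current in (1.0, 5.0, 10.0):
        print(f"I = {current:4.1f} A -> V_arc = {nottingham_arc_voltage(current):.1f} V")
```

The falling voltage with rising current reflects the negative-resistance behaviour that the equation encodes for low-current arcs.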
2.3.2.2. Hall, Myers, and Vilcheck Arc Model 
Nottingham's arc model only covers low current levels because high-power DC sources 
were not available in the early years. In 1978, Hall et al. carried out arc testing at high 
current levels ranging from 300 A to 2400 A, with air gap widths ranging from 4.8 mm to 
152 mm [49]. The experimental results were found to match the estimate given by 
Nottingham in (2.5). Hence, the model is capable of simulating series arc faults at high 
current levels (i.e. at the DC side of the solar inverter in a large PV system) and parallel arc faults. 
 
 
 
2.3.2.3. Stokes and Oppenlander Arc Model 
The Stokes and Oppenlander model is the most exhaustively developed of the 
existing models, as it covers arc currents from 0.1 A to 20 kA and air gap 
widths from 5 mm to 500 mm with electrodes in series [41]. It is widely used for 
incident-energy estimation of DC arc faults in industrial applications [50]. The V-I 
characteristic for arc currents above the transition point (the constant voltage 
region) is shown below: 
$$V_{arc} = (20 + 0.534L)\, I_{arc}^{0.12} \tag{2.6}$$
$$I_{transition,arc} = 10 + 0.2L \tag{2.7}$$
where 𝐿 denotes the air gap width in mm. This arc model is very suitable for high 
current arc simulation such as the series arc fault in the combiner box and solar inverter, 
and the parallel arc fault between two strings. 
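Equations (2.6) and (2.7) can be sketched in Python as follows. The function names are hypothetical, and raising an error below the transition current is a design choice made here to keep the sketch within the region for which (2.6) is stated.

```python
def transition_current(gap_mm):
    """Transition current in amperes from Eq. (2.7), with the gap width L in mm."""
    return 10.0 + 0.2 * gap_mm


def stokes_oppenlander_voltage(i_arc, gap_mm):
    """Arc voltage in volts from Eq. (2.6), valid above the transition current."""
    if i_arc < transition_current(gap_mm):
        # Assumption of this sketch: refuse to extrapolate below the transition point.
        raise ValueError("Eq. (2.6) applies only above the transition current")
    return (20.0 + 0.534 * gap_mm) * i_arc ** 0.12


if __name__ == "__main__":
    gap = 10.0  # mm, illustrative value
    print(f"Transition current for a {gap} mm gap: {transition_current(gap)} A")
    print(f"Arc voltage at 100 A: {stokes_oppenlander_voltage(100.0, gap):.1f} V")
```

The small exponent of 0.12 means the voltage changes only slowly with current in this region, which is why it is described as the constant voltage region.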
2.3.2.4. Paukert Arc Model 
To cover both the “constant power” region and the “constant voltage” region, 
Paukert developed a model that covers arc currents from 0.3 A to 100 kA and 
air gap widths from 1 mm to 200 mm, as follows: 
$$V_{arc} = \begin{cases} \dfrac{A}{I_{arc}^{B}}, & I \le 100\ \mathrm{A} \\[6pt] A\, I_{arc}^{B}, & I > 100\ \mathrm{A} \end{cases} \tag{2.8}$$
where $A$ and $B$ are positive arc constants that vary with the gap width [51]. Because of 
its wide current range, this arc model can be used to simulate both series and parallel 
arc faults in PV systems at different levels with a fixed gap width. However, it is 
difficult to implement when the gap width changes continuously, because the gap width 
does not appear explicitly in the model. 
 
2.3.2.5. Modified Paukert Arc Model 
In [52], to further reveal the dependency between the V-I curve and gap width, a 
modified Paukert arc model with gap width integrated in both arc constants A and B has 
been proposed for low-current series arc: 
$$V_{arc} = \frac{A + CL}{I_{arc}^{B + DL}} \tag{2.9}$$
where $A$, $B$, $C$, and $D$ are arc constants that depend on the experimental conditions, and $L$ is 
the gap width in inches. This model is intended for modern DC power systems with 
operating voltages of several hundred volts, which is often the case for current PV 
systems. Because of the low current and small gap width, it is well suited to the 
simulation of series arc faults below PV string level. 
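The structure of (2.9) can be illustrated with a short Python sketch. The arc constants are not given in this section, so the function takes them as required arguments; the numbers in the usage lines are placeholders chosen only to make the arithmetic easy to follow and are not measured values.

```python
def modified_paukert_voltage(i_arc, gap_in, a, b, c, d):
    """Arc voltage from Eq. (2.9): V_arc = (A + C*L) / I_arc**(B + D*L).

    gap_in is the gap width L in inches. The constants a, b, c, d must be
    fitted to experimental data; none are assumed here.
    """
    if i_arc <= 0 or gap_in < 0:
        raise ValueError("arc current must be positive and gap width non-negative")
    return (a + c * gap_in) / i_arc ** (b + d * gap_in)


if __name__ == "__main__":
    # Placeholder constants for illustration only.
    constants = dict(a=10.0, b=0.5, c=20.0, d=0.1)
    for gap in (0.0, 0.05, 0.1):
        print(f"L = {gap} in -> V_arc = {modified_paukert_voltage(4.0, gap, **constants):.2f} V")
```

Because the gap width enters both the numerator and the exponent, a continuously changing gap can be simulated by calling the function at each time step, which is the limitation of (2.8) that this model addresses.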
2.3.3. Heuristic Arc Model 
The heuristic model is also based on experimental observation. However, it has 
additional ad hoc parameters to improve the correlation between simulation and 
experimental results. In [53], a practical series DC arc model independent of electrical 
time constants was developed from a large set of experimental data. The arc 
voltage is represented by the sum of two hyperbolic approximations: a 
nonlinear voltage component (the arc column voltage) and an electromagnetic force 
component (the anode and cathode voltage), as follows: 
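$$V_n = V_{dc,source}\left(0.5 + 0.5\tanh\big(\alpha(q - 1)\big)\right) \tag{2.10}$$
$$e_{gap} = 0.5\,(a + b\,x_{gap})\big(\tanh(\lambda q) - \tanh(\lambda(q - 1))\big) \tag{2.11}$$
$$q = \frac{x_{gap}}{x_{extinction}} \tag{2.12}$$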
where $q$ represents the separation ratio (jitter), $x_{gap}$ is the air gap width, $x_{extinction}$ is 
the critical distance between the two electrodes at which the arc quenches, and the parameters $a$, $b$, $\alpha$, 
and $\lambda$ set the offset of $e_{gap}$, the voltage gradient of $e_{gap}$, the slope of the nonlinear voltage 
$V_n$, and the slope of $e_{gap}$, respectively. An arc is established when $q = 0^{+}$ and 
is extinguished when $q = 1$. $x_{gap}$ can be determined by measurement, 
whereas $x_{extinction}$ cannot be measured precisely because it is a random variable. Thus, 
$x_{extinction}$ can represent the random behaviour of the simulated arc in terms of $q$, which 
is given by: 
𝑞|𝑡=𝑘𝑇 = 𝑞|𝑡=𝑘𝑇−1 + 𝑟𝑎𝑛𝑑(𝑐)10 (2.13) 
where $k \in \mathbb{Z}$, $T$ is the current simulation time step, and $rand(c)$ is a random number 
between 0 and $c$. Note that $q$ will be set to a new random number when $q > q_{th}$ ($q_{th}$

The high-frequency variation is closely related to the 
discharge of the plasma because of the chaotic nature of the arc. Therefore, it is 
important to account for those factors in the arc noise modelling. 
In order to describe the arc random behaviour, zero-mean Gaussian noise could be 
added into the arc signal [55], [56]. The high-frequency part in the arc fault signal can 
be fitted into the equation as shown in (2.14). 
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) \tag{2.14}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation of the noise. The value of $\mu$ is assumed to be 
0 in the noise modelling. It has been found that the arc noise in PV systems can be 
characterised as pink noise ($1/f$ noise), whose magnitude decreases as the frequency 
increases [16], [57]. In [58], pink noise is used to simulate the AC component of 
the arc fault. In [59], a modified pink noise model is introduced to better simulate 
arc faults in different physical states. It is also useful for simulations that analyse 
the propagation of the arc fault signal through wires and devices that attenuate noise. Based 
on evaluations, the simulated signals match the experimental signals in the frequency 
domain. In [53], an additional equation is established to represent the random 
behaviour of the arc, as mentioned earlier. 
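The zero-mean Gaussian approach can be sketched with the Python standard library alone. The function names, the seed argument, and the 8 A example current are assumptions made for illustration; pink noise shaping is not covered by this sketch.

```python
import math
import random


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Probability density from Eq. (2.14); sigma is the standard deviation."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def add_gaussian_noise(samples, sigma, seed=None):
    """Return a copy of samples with zero-mean Gaussian noise added to each value."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in samples]


if __name__ == "__main__":
    dc_current = [8.0] * 5  # illustrative steady arc current in amperes
    print(add_gaussian_noise(dc_current, sigma=0.1, seed=1))
```

Fixing the seed makes a simulated arc trace reproducible, which helps when comparing detection algorithms on the same synthetic signal.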
 
Table 2.1 Summary of DC arc fault models for simulation 
Arc model | Current level | Gap width (mm) | Series arcing? | Parallel arcing? | High frequency variation
Mayr | Low | Depends | Yes, at below string level | No | Gaussian/Pink noise
Cassie | High | Depends | Yes, at array level | Yes | Gaussian/Pink noise
Nottingham | Low (below 10 A) | 1-10 | Yes, at below string level | No | Gaussian/Pink noise
Hall, Myers, and Vilcheck | High (300 A to 2400 A) | 4.8-152 | Yes, at array level | Yes | Gaussian/Pink noise
Stokes and Oppenlander | High (above transition line) | 5-500 (5, 20, 100, 500) | Yes, at array level | Yes | Gaussian/Pink noise
Paukert | Low/High (0.3 A-100 kA) | 1-200 | Yes | Yes | Gaussian/Pink noise
Modified Paukert | Low (below 25 A) | 1-3 | Yes, at below string level or at combiner box with small current | No | Gaussian/Pink noise
Heuristic arc model [53] | Low/High | Depends | Yes | No | Additional equation
Heuristic arc model [54] | Low/High | Depends | Yes | Possible | Additional equation
 
2.4. DC Arc Faults Detection Methods in PV Systems 
AC arc fault recognition and detection have been widely researched for a long time, 
whereas DC arc fault detection is far less developed [60]. With the release of the outline and 
standard related to DC arc fault protection in PV systems in 2013 and 2018, respectively, 
the demand for effective DC arc fault detection algorithms and products is increasing 
rapidly [14], [15], [61]. 
Besides PV-specific detection methods, some detection methods for other DC 
systems, such as electric vehicles and DC microgrids, are also reviewed in this section, 
because they can be adopted in PV systems with little or no modification. 
 
2.4.1. Sensors for Measurement 
There are several types of signals available for arc fault detection purpose in 
different DC system applications. Current is more commonly used than other signals 
such as voltage and electromagnetic signals. Voltage is less popular mainly because the 
fault locations are unknown, and the arc voltage cannot be directly measured. 
Furthermore, more voltage sensors than current sensors are typically required 
to cover the entire system [20]. Radio frequency sensors capture the 
electromagnetic wave induced when a series arc fault occurs. In the higher frequency band, there 
is less impact from common interference such as power electronics noise. 
However, the electromagnetic signals can be attenuated by physical obstructions 
between the fault source and the sensors. For example, in practical conditions, a series 
arc fault can occur in MC4 connectors at the backside of the PV panel. In this case, the 
PV panel itself can be the obstacle between the arc fault and the sensors. Other sensors 
such as optical, acoustic, and thermal imaging sensors also have potential for series arc 
fault detection. However, similar to radio frequency sensors, their detection ranges are 
generally limited. To apply those sensors in large PV applications, studies of optimal 
sensor locations are required. Therefore, the measured system current signals are used 
for the majority of detection methods. 
2.4.2. Fast Fourier Transform 
Fourier analysis is a classic approach to frequency domain analysis. It is found that 
the noise floor will increase after arc fault occurrence [33], [57], [62]–[65]. J. Johnson 
et al. from Sandia Laboratory carried out a series of tests related to DC arc faults in PV 
systems, and the frequency band of 1-100 kHz is recommended for fault detection [35], 
[57], [66]. 
 
The fast Fourier transform (FFT) algorithm is convenient to implement since it is 
available in most software libraries. Detection methods based on the 
FFT can be found in [61], [63], [64], [67]–[76]. Most methods compare the amplitude 
or power of the frequency spectrum in the arcing state with that in the non-arcing state (or with a 
pre-defined threshold value calculated from the frequency spectrum of the non-arcing 
state) to make a decision. In [67], the sum of the frequency spectrum of the CT current in the 
0-12.5 kHz band is used, at a sampling frequency of 250 kHz. In 
[61], [68], [69], the spectral content of the CT current within the 
40-100 kHz band is used to detect series or parallel arcs at a 
sampling frequency of 250 kHz. A detailed design of a DC series AFD based on the 
well-known, cost-effective TMS320F28335 digital signal processor can be found in [69]. The 
hardware structure in [69] is illustrated in Figure 2.6. 
 
[Figure 2.6 block diagram: current sensor (CT: PA3655NL), 8th-order active band-pass filter, 16-bit ADC, and TMS320F28335 DSP implementing the series arc fault identification technique, with current signal input, trip signal output, and 5 V, 3.3 V, and 2.5 V supply rails derived from a 7 V to 12 V DC supply.]
 
Figure 2.6 An example of hardware structure of the DC series AFD [72] 
 
Similarly, the 40-80 kHz and 30-100 kHz frequency bands of the CT current are used 
at a sampling frequency of 200 kHz in [70] and [71], respectively. However, such 
methods may fail in the presence of other electromagnetic interference [37], [38]. 
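The band-comparison idea shared by these methods can be sketched as follows. This is not the implementation of any cited reference: the 40-100 kHz default band and 250 kHz sampling frequency follow the values quoted above, while the threshold ratio of 3 and the function names are hypothetical. A direct DFT sum is used only to keep the sketch dependency-free; a practical detector would call an FFT routine.

```python
import cmath
import math


def band_energy(samples, fs, f_lo, f_hi):
    """Spectral energy of samples between f_lo and f_hi (Hz), via a direct DFT."""
    n = len(samples)
    if f_hi > fs / 2:
        raise ValueError("band must lie below the Nyquist frequency")
    k_lo = math.ceil(f_lo * n / fs)
    k_hi = math.floor(f_hi * n / fs)
    energy = 0.0
    for k in range(k_lo, k_hi + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        energy += abs(coeff) ** 2
    return energy / n


def is_arcing(samples, fs, baseline_energy, ratio=3.0, band=(40e3, 100e3)):
    """Flag an arc when the band energy exceeds ratio times the non-arcing baseline."""
    return band_energy(samples, fs, *band) > ratio * baseline_energy


if __name__ == "__main__":
    fs, n = 250e3, 250
    in_band = [math.sin(2 * math.pi * 50e3 * t / fs) for t in range(n)]
    print(is_arcing(in_band, fs, baseline_energy=1.0))
```

A fixed ratio like this is exactly the kind of pre-defined threshold that can be fooled by switching noise or other interference falling inside the monitored band.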
Instead of comparing the whole spectrum, the method in [72] separates the frequency 
spectrum into different sub-bands, which increases the accuracy substantially. In 
[73], Kanemaru et al. apply a sorting method to mitigate the impact of switching 
noise from power electronic devices. The amplitudes of the frequency bins 
within the 10-100 kHz band of the CT current spectrum (500 
kHz sampling frequency) are sorted in ascending order. It is known that the overall 
noise intensity in the signal increases when an arc occurs. Furthermore, some case 
studies report that the amplitude of switching noise is larger than that of arc noise 
in certain frequency bands [62]. Therefore, when switching noise and its harmonics have 
a greater intensity than arc noise, the switching noise components concentrate 
toward the higher sort numbers. By considering only the frequency content in the 
region of small sort numbers, the switching noise can be eliminated effectively. Under 
the experimental conditions, the proposed method dramatically increases the ratio of 
signal intensity (the maximum integrated value of arc current before arc occurrence to 
the maximum integrated value of arc current after arc occurrence) from 2.2 to 907. 
However, the proposed algorithm has not been validated under other normal transient 
conditions, such as a sudden drop of current due to partial shading or inverter start-up. 
Further study is required to validate its robustness against nuisance tripping. 
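The sorting idea described for [73] can be illustrated with a few lines of Python. This is a sketch of the principle only: the fraction of bins kept and the function name are hypothetical, and the amplitudes are made-up numbers chosen to show the effect.

```python
def low_sort_feature(bin_amplitudes, keep_fraction=0.5):
    """Sum the smallest bin amplitudes after sorting in ascending order.

    Narrow-band switching noise lands at the high sort numbers and is
    discarded; a broadband rise in the noise floor still raises the sum.
    """
    ordered = sorted(bin_amplitudes)
    keep = max(1, int(len(ordered) * keep_fraction))
    return sum(ordered[:keep])


if __name__ == "__main__":
    switching = [100.0, 120.0]           # illustrative switching harmonics
    normal = [1.0] * 8 + switching       # quiet noise floor
    arcing = [3.0] * 8 + switching       # raised noise floor during an arc
    print(low_sort_feature(normal, 0.8), low_sort_feature(arcing, 0.8))
```

In this toy case the plain sum of all bins changes only from 228 to 244, whereas the sorted feature triples, which is the kind of contrast improvement the sorting step is meant to deliver.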
A predefined threshold value is a limitation, as the appropriate value can differ between 
systems. In [74], a series arc fault detection method based on relative magnitude 
comparison is proposed, in which adaptive threshold values for the frequency domain are 
calculated statistically. The sampling frequency is 250 kHz, the window length is 
0.002 s, and the feature frequency band is 10-50 kHz. After time domain analysis is used 
to locate the potential arc instant in the loop current (the candidate point), the 
threshold value for each frequency component is determined by calculating the mean 
and standard deviation of that component over several consecutive windows 
before the candidate point. When most of the frequency component magnitudes exceed 
their threshold values for a certain number of consecutive periods, a series arc fault event 
is confirmed. This method can effectively differentiate the arc state from many normal 
operations, such as inverter disconnection and load switching in a DC microgrid, and 
detection is fast (less than 16 ms). Most importantly, the dynamic threshold 
adjustment makes the algorithm flexible across different systems and operating 
environments. However, this algorithm may not work properly during 
inverter or converter start-up, which is often encountered in PV systems. In this case, there 
is a steep change in the time domain and the magnitude of the high-frequency content 
increases in the frequency domain just as for an arc fault, which may cause unwanted 
trips. 
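The adaptive-threshold logic can be sketched as below. The source states only that the mean and standard deviation of each frequency component are used, so the form mean + k·std, the multiplier k, the "more than half of the bins" rule, and the number of consecutive windows are all assumptions of this sketch rather than the settings of [74].

```python
from statistics import mean, stdev


def adaptive_thresholds(history, k=3.0):
    """Per-bin thresholds from windows recorded before the candidate point.

    history is a list of windows, each a list of per-bin magnitudes.
    """
    return [mean(b) + k * stdev(b) for b in zip(*history)]


def window_exceeds(magnitudes, thresholds, min_fraction=0.5):
    """True when more than min_fraction of the bins exceed their thresholds."""
    exceeded = sum(m > t for m, t in zip(magnitudes, thresholds))
    return exceeded > min_fraction * len(thresholds)


def confirm_arc(windows, thresholds, consecutive=3, min_fraction=0.5):
    """Confirm an arc after the required number of consecutive exceeding windows."""
    run = 0
    for window in windows:
        run = run + 1 if window_exceeds(window, thresholds, min_fraction) else 0
        if run >= consecutive:
            return True
    return False


if __name__ == "__main__":
    before = [[1.0, 2.0], [1.2, 2.2], [0.8, 1.8]]
    limits = adaptive_thresholds(before)
    print(limits, confirm_arc([[5.0, 5.0]] * 3, limits))
```

Because the thresholds are recomputed from the windows just before each candidate point, the same code adapts to systems with different noise floors without retuning.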
Miao et al. proposed DC series arc fault detection in PV systems using the pink noise 
characteristics of the loop current at a sampling frequency of 50 kHz [75], [76]. 
Besides the algorithms they proposed, another important contribution of their work is 
the use of a tunnel magnetoresistance sensor to measure the PV loop current. According to 
their study, the tunnel magnetoresistance sensor has a smaller size, relatively high bandwidth 
(~1 MHz), and relatively low cost (a few USD) compared with other types of current 
sensors, such as current probes [52], current transformers [77], [78], or magnetic sensors 
such as Hall-effect sensors [79], used in the existing literature. 
A summary of FFT based detection methods is presented in Table 2.2. 
 
 
Table 2.2 Summary of FFT based detection methods 
Ref. | Verified by experiment? | Microcontroller or product | Sampling frequency | Test accuracy | Detection time
[61] | Yes | RD-195 | 250 kHz | Not mentioned

rate (13.42%) with 0% rejection rate. The three 
mentioned features form a new mixed criterion, which significantly decreases 
both the malfunction rate (0%) and the rejection rate (0.875%). The drawback of the proposed 
method is the high sampling rate. Nevertheless, mixing criteria largely compensates for the 
inadequacies of the single-criterion method. 
In [83], WPD with coif4 has been applied to extract the energy of different sub-
bands under the sampling frequency of 100 kHz. After 6-level decomposition, both 
series and parallel arc faults can be detected once the ratio of the sum of the square of 
reconstruction coefficients of the high-frequency band (781.25 Hz-50 kHz) to that of 
the lowest frequency band (0-781.25 Hz) exceeds the predefined threshold value for 
several consecutive analysis periods. 
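The band limits quoted for [83] follow from the decomposition depth: a 6-level WPD splits the 0-50 kHz range available at a 100 kHz sampling frequency into 2^6 = 64 equal bands of 781.25 Hz. The sketch below reproduces that arithmetic and the energy-ratio feature. The function names are hypothetical, the bands are listed in frequency order (which in general differs from the natural node order of a wavelet packet tree), and the coefficients passed to the ratio function are assumed to come from a separate WPD routine.

```python
def wpd_band_edges(fs, levels):
    """Frequency bands (Hz) of a full wavelet packet tree, in frequency order."""
    width = (fs / 2) / 2 ** levels
    return [(i * width, (i + 1) * width) for i in range(2 ** levels)]


def high_to_low_energy_ratio(band_coeffs):
    """Ratio of summed squared coefficients in all higher bands to the lowest band.

    band_coeffs is a list of coefficient lists, lowest-frequency band first.
    """
    low = sum(c * c for c in band_coeffs[0])
    if low == 0:
        raise ValueError("lowest band has zero energy")
    high = sum(c * c for band in band_coeffs[1:] for c in band)
    return high / low


if __name__ == "__main__":
    bands = wpd_band_edges(100e3, 6)
    print(len(bands), bands[0], bands[-1])
```

The same helper gives the four 25 kHz bands of a 2-level decomposition at a 200 kHz sampling frequency.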
In [52] and [85], a 2-level WPD with the db8 wavelet is applied for 
normalisation at a sampling frequency of 200 kHz. The RMS value of the 
coefficients in the 0-25 kHz band (including the DC offset) is normalised by that 
of the coefficients in the 25-50 kHz band over a time window of a certain length (i.e. 10 ms). The peak RMS 
value rises from 6% to 15% after normalisation, which means that the signature induced 
by the series arc fault has been amplified. 
The performance of such algorithms relies significantly on the mother wavelet. Db 
wavelets have proven suitable for arc fault diagnosis in resistive systems [86] but 
exhibit detection limitations when applied to inverter-based systems. To address this 
issue, Chen et al. chose an rbio3.1-based mother wavelet to extract more distinguishable 
arc features in various grid-tied PV systems [91]. 
The main difference between DWT and WPD is shown in Figure 2.8. Besides 
decomposing the lower frequency band at each level, WPD also decomposes the 
higher frequency band of the signal. Thus, WPD offers more information than DWT at 
the expense of almost double the computational burden. 
A summary of DWT and WPD based detection methods is presented in Table 2.4. 
 
 
[Figure 2.8: two decomposition trees. Left, Discrete Wavelet Transform: the original signal is split into A1 and D1, A1 into A2 and D2, and A2 into A3 and D3. Right, Wavelet Packet Decomposition: both branches are split at every level, giving A1 and D1, then AA2, DA2, AD2, and DD2, then eight level-3 nodes from AAA3 to DDD3. H denotes low-pass filtering and decimation; G denotes high-pass filtering and decimation. Each panel has a frequency axis ticked from π/8 to π.]
 
Figure 2.8 Comparison of DWT and WPD analysis (3-level as an example) 
 
 
Table 2.4 Summary of wavelet transform based detection methods 
Ref. | Technique | Verified by experiment? | Microcontroller or product | Sampling frequency | Test accuracy | Detection time
[26] | DWT | Yes | Not implemented | 1 MHz | Not mentioned | Not mentioned
[88] | DWT | Yes | Not implemented | 200 kHz | Not mentioned | Not mentioned
[89] | DWT | Yes | RD-195 | 200 kHz | Not mentioned | 0.1 s
[52], [85] | WPD and Statistics | Yes | TMS320F28335 | 200 kHz | 100% at voltage below 300 V and current below 25 A; 60% at 240 V/25 A; 40% at 300 V/25 A | ~0.1 s
 
2.4.5. Statistical Analysis 
There are various statistical features that can be used for DC arc faults detection, 
such as mean, standard deviation, RMS value, entropy, and the extreme value of the 
input signal. 
Methods based on time domain analysis can be found in [33], [52], [58], [92]–
[98]. In [92], the rate of change of the loop current in the time domain is proposed as an 
indicator of an arc fault event. However, it is easily affected by random spike 
disturbances. In [52] and [93], the difference between the maximum and minimum 
current values over a time window of a certain length is defined as the indicator. This 
method is simple but quite effective at recognising arc faults, especially in the initial stage. 
Although it can substantially eliminate random disturbances from noise, its 
performance may be affected by other factors, such as inverter MPPT operation 
when the irradiation level changes quickly. In [94] and [95], a detection method based on 
outlier analysis achieves a 98% PV series arc fault detection rate at a false alarm rate of 
0.01% in a single PV module in simulation. A minimum covariance determinant 
estimator is used to optimise the performance of the algorithm. In this outlier analysis, 
the operating voltage and current of different PV modules at the same time instant are 
fed into the minimum covariance determinant estimator. The estimator then calculates the 
distance in the I-V characteristic curve between each PV module and the centre of the PV 
modules' distribution, and this distance is used for detection. 
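The maximum-minus-minimum indicator attributed to [52] and [93] above is simple enough to sketch directly. The use of non-overlapping windows and the function name are assumptions of this sketch; the sample values are illustrative only.

```python
def peak_to_peak(samples, window):
    """Max-minus-min of the current over consecutive non-overlapping windows."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [max(samples[i:i + window]) - min(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]


if __name__ == "__main__":
    # Steady current followed by a fluctuating segment (illustrative values, in A).
    current = [8.0, 8.0, 8.0, 8.0, 8.0, 9.5, 6.5, 8.0]
    print(peak_to_peak(current, 4))
```

A detector would compare each value with a threshold; any slow drift within a window, for example from MPPT tracking, also raises the indicator, which is the weakness noted in the text.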
In [96], a finite impulse response estimator is used to calculate the variance of the 
input voltage signal at a sampling frequency of 50 kHz. The input signal is first 
passed through a band-pass filter with cut-off frequencies of 1 kHz and 7.5 kHz and then 
fed into the estimator, which compares the current value with the 
previous value. When the estimation is perfect, the variance is expected to be 0. 
Once the variance exceeds a pre-defined threshold value, an arc fault event is 
detected. The proposed algorithm is easier and much cheaper to implement, but at 
the expense of lower accuracy compared with other methods. 
In [97], multiple detection criteria are used jointly to detect arc faults in PV 
systems. In the time domain, statistical features including the mean and variance of the 
loop current are calculated. In the frequency domain, the ratio of the frequency content of 
the loop current in 1 Hz - 4 kHz to the DC components and to the AC components is calculated. 
When at least one of the time domain features and the frequency domain feature exceed 
their respective pre-defined threshold values, an arc event is declared. It should be 
highlighted that using multiple detection criteria increases the accuracy of the algorithm 
significantly. 
Recently, a statistical detection approach based on arc current entropy was 
introduced in [58]. This method can effectively differentiate arc faults from 
normal events (non-arc states) such as MPPT operation and inverter switch-on. It 
calculates the modified Tsallis entropy of the loop current twice. Tsallis entropy can reveal 
the degree of disorder and the intrinsic behaviour of a signal: 
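$$M = \sum_{k=1}^{K} p(x_k)^{q} = 1 - (q - 1)E_{Tsallis,M} \tag{2.15}$$
$$p(x = x_k \mid t = kT) = p(x_k) \overset{\text{def}}{=} \frac{\lVert x_k \rVert^{2}}{\sum_{i=1}^{K} \lVert x_i \rVert^{2}} \tag{2.16}$$
$$M' = \sum_{k=1}^{K'} p(x_{k,M})^{q'} = 1 - (q' - 1)E_{Tsallis,M'} \tag{2.17}$$
$$p(x = x_{k,M} \mid t = kT) = p(x_{k,M}) \overset{\text{def}}{=} \frac{\lVert x_{k,M} \rVert^{2}}{\sum_{i=1}^{K'} \lVert x_{i,M} \rVert^{2}} \tag{2.18}$$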
where 𝑀 is the modified Tsallis entropy, 𝑞 and 𝑞′ > 0, 𝑥𝑘 denotes the samples of the 
signal of interest, 𝐾 is the sliding window size in the first calculation, 𝐾’ is the sliding 
window size in the second calculation, and in each sliding window both the sum of 
𝑝(𝑥𝑘) and 𝑝(𝑥𝑘,𝑀) is 1. In the first stage of modified Tsallis entropy evaluation (𝑀) 
where the captured current is the input signal, MPPT algorithm (due to fast-moving and 
mechanical vibration induced by wind) may introduce variance in the value of 𝑀 with 
certain patterns. Then, with proper value of 𝐾 and 𝐾’ , those disturbances can be 
eliminated in the second stage of modified Tsallis entropy evaluation (𝑀′) where 𝑀 is 
the input signal. 𝑀′ is then passed through a first-order infinite impulse response filter 
with a 16 Hz cutoff frequency to remove its DC offset and obtain the detection feature 
𝑀𝑧𝑜. The threshold value is then calculated from the standard deviation of 𝑀𝑧𝑜 and is 
updated every 0.5 seconds to keep up with the changing operating conditions of the 
system. The sampling frequency is only 10 kHz, and the computation load is just 24𝑁 
flops per sliding window compared to 5𝑁𝑙𝑜𝑔2𝑁 flops for an FFT, which makes the 
method very cost-effective. However, this method will be less effective in a noisier 
environment. 
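The two-stage evaluation in (2.15)-(2.18) can be illustrated with a minimal Python sketch. It is not the implementation of [58]: the function names, the one-sample stride of the sliding windows, the handling of an all-zero window, and the toy input are assumptions made only for illustration, and the subsequent IIR filtering and adaptive threshold are omitted.

```python
def tsallis_sum(window, q):
    """M = sum_k p(x_k)**q with p(x_k) = x_k**2 / sum_i x_i**2, following (2.15)-(2.16)."""
    energy = sum(x * x for x in window)
    if energy == 0.0:
        # Assumption: an all-zero window is treated as a uniform distribution.
        return len(window) ** (1.0 - q)
    return sum((x * x / energy) ** q for x in window)


def tsallis_entropy(m, q):
    """Recover E_Tsallis from M = 1 - (q - 1) * E_Tsallis (requires q != 1)."""
    return (1.0 - m) / (q - 1.0)


def sliding_windows(values, size):
    """All windows of length `size`, moved one sample at a time (assumed stride)."""
    return [values[i:i + size] for i in range(len(values) - size + 1)]


def two_stage_feature(current, k1, k2, q1, q2):
    """Stage 1: M over the current samples. Stage 2: M' over the sequence of M values."""
    m = [tsallis_sum(w, q1) for w in sliding_windows(current, k1)]
    m_prime = [tsallis_sum(w, q2) for w in sliding_windows(m, k2)]
    return m, m_prime


# Toy samples, not measured data; window sizes and q values are hypothetical.
current = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 3.0, 0.0, 2.0, 2.0]
m, m_prime = two_stage_feature(current, k1=4, k2=3, q1=2.0, q2=2.0)
```

In this sketch a window with evenly spread energy gives a small 𝑀 (high entropy), while a window dominated by a single sample gives 𝑀 close to 1 (low entropy).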
 
In [98], besides using the line current, Lu et al. also extracted useful information 
from the PV supply voltage. Based on extensive experimental study on DC series arc 
faults under different conditions in an experimental PV system, the change rate of the 
line current, the change rate of the average line current, and the standard deviation of 
the line current and AC components of the supply voltage are selected for detection. 
However, there are more threshold parameters to set, and it would take more effort to 
fine-tune them for a different PV system. Furthermore, although the proposed algorithm 
has relatively low computational complexity, it requires both a current sensor and a 
voltage sensor, whereas other methods typically require only one of the two. The case 
study is carried out at a 200 kHz sampling frequency. It is worthwhile to investigate the 
feasibility and effectiveness of the proposed method at a lower sampling frequency 
(e.g. 20 kHz). 
Overall, the statistics-based fault diagnosis methods generally require a lower 
sampling frequency and less computation effort, but their performance is severely 
affected by the noise level of the surrounding environment. A summary of statistical 
analysis-based detection methods is presented in Table 2.5. 
Table 2.5 Summary of statistical analysis based detection methods 
| Ref. | Verified by experiment? | Microcontroller or product | Sampling frequency | Test accuracy | Detection time |
|---|---|---|---|---|---|
| [58] | Yes | TMS320F28335 | 10 kHz | Pass all test cases under experimental condition | ~0.511 s |
| [94] | Yes | Not mentioned | Not mentioned | 98% | Not mentioned |
| [95] | Yes | Not mentioned | Not mentioned | 98% | Not mentioned |
| [96] | Yes | TMS320F2808 | 50 kHz | Not mentioned | |

of 99.4%. 
SVM is believed to be better than ANN for the following reasons. Firstly, SVM does 
not suffer from the over-fitting problem in the training process. Secondly, SVM can 
always find the global minimum during training, while ANN may converge to a local 
minimum; in other words, ANN often provides locally optimal solutions and loses the 
big picture. Finally, SVM can achieve high accuracy with fewer data sets than ANN. 
However, the performance of SVM depends strongly on the quality of the training data. 
Both ANN and SVM are heuristic techniques, and thus their reliability is difficult to 
prove. 
FL can also be applied for DC arc fault detection. In [112], Grichting et al. 
proposed a PV series arc fault detection method using two-level FL and electrical 
parameters calculated from the loop current and input voltage of the inverter. The rules 
in the fuzzy system are designed according to the fault pattern and mechanism. The 
input signal is first fuzzified as the input of the fuzzy system, and then arc faults and 
normal operation are classified with predefined rules. The performance of FL-based 
methods depends heavily on expert knowledge. Incomplete or incorrect knowledge 
could severely affect the detection accuracy. However, FL-based methods provide 
economical solutions since they require relatively little computation effort for real-time 
implementation. 
Other ML techniques, such as decision tree (DT) learning and learning classifier 
systems, have previously been used for fault detection (not including arc faults) and 
classification in PV systems [113], [114]. In [113], a semi-supervised graph-based 
model was applied to detect and classify line-to-line faults, open circuit faults, and the 
normal condition. In [114], a supervised DT-based model was used for a similar 
function. Most recently, in [91], Chen et al. proposed a random forest (RF) based 
protection strategy using rbio3.1-based DWT features for series arc fault detection in 
PV systems. RF demonstrates better performance than SVM when the input size is 
larger. With the help of the more generalised features extracted by the rbio3.1-based 
wavelet, compared to the features extracted by db-based wavelets, the RF classifier can 
achieve a 90% accuracy level in different grid-connected PV systems even without 
adjusting its parameters. The hidden Markov model (HMM), which is well suited to 
applications involving non-stationary and highly transient signals, has been applied to 
detect DC series arc faults [103]. A DC network model combined with the series arc 
model in [53] has been established in Matlab/Simulink to generate a dataset covering 
different conditions (series arc fault condition, nominal steady-state condition, and 
nominal transient condition). The DWT (the mother wavelet is db2) level 1-3 
approximation coefficients and level 1-2 detail coefficients, and the moving average of 
the loop current under the different conditions in a 50-ms window (6 features in total), 
are chosen as the features for series arc fault detection at a sampling frequency of 
20 kHz. Those features are fed into the HMM for training. The HMM outputs a 
log-likelihood metric to quantify the probability of the presence of an arc. With proper 
selection of the threshold values for the log-likelihood, a series arc fault event can be 
detected and discriminated from other conditions. HMM is a probabilistic model, and 
there are several limitations to probabilistic model-based methods. Firstly, the accuracy 
relies heavily on the quality of the training data, which can be collected from real 
systems or simulation. Capturing data in the real world is very costly, and it is difficult 
to cover all conditions, while data obtained from simulation depends heavily on the 
accuracy of the system modelling. Secondly, compared to other methods, probabilistic 
model-based methods need more computation effort. For instance, although the HMM 
has the advantage of a minimal computation load for calculating the log-likelihood, its 
order of computational complexity is still very high: 𝒪(𝑁²) compared to 𝒪(𝑁𝑙𝑜𝑔𝑁) for 
the FFT and 𝒪(𝑁) for the one-dimensional db2 DWT. 
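The log-likelihood that an HMM assigns to an observation sequence is commonly computed with the forward algorithm. The sketch below is a generic, scaled forward pass for a discrete-observation HMM and is not the model of [103]: the two hidden states, the quantised observation symbols, all probability values, and the threshold are hypothetical. The inner loop over pairs of hidden states is where a quadratic cost appears in this sketch; whether that corresponds to the 𝑁 used in the text is an assumption.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model) for a discrete HMM.

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of observing symbol o in state i
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for o in obs[1:]:
        # Each step visits every (i, j) state pair, i.e. n * n multiplications.
        alpha = [emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n))
                 for j in range(n)]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical two-state model and a short sequence of quantised feature symbols.
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.2, 0.8]]
EMIT = [[0.9, 0.1], [0.3, 0.7]]
log_lik = forward_log_likelihood([0, 1, 1, 0], START, TRANS, EMIT)
THRESHOLD = -5.0  # hypothetical decision threshold on the log-likelihood
exceeds_threshold = log_lik > THRESHOLD
```

Rescaling the forward variables at every step avoids numerical underflow on long sequences, which is why the logarithms of the scale factors are accumulated instead of multiplying raw probabilities.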
Instead of using only one ML classifier to make the decision, ensemble ML combines 
multiple ML classifiers using one of three ensemble learning techniques: bagging, 
boosting, and stacking, as illustrated in Figure 2.9. 
 
Figure 2.9 Ensemble ML techniques: (a) Bagging; (b) Boosting; (c) Stacking 
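The decision-combination step of bagging and boosting can be sketched as follows. Only the aggregation of already trained base classifiers is shown, not their training or the resampling of the training set. The label convention (1 = arc, 0 = normal), the tie-breaking rule, and the weights are assumptions for illustration.

```python
def majority_vote(decisions):
    """Bagging-style aggregation: every base classifier has one equal vote.

    A tie is resolved as 0 (normal); this tie rule is an assumption.
    """
    return 1 if 2 * sum(decisions) > len(decisions) else 0


def weighted_vote(decisions, weights):
    """Boosting-style aggregation: votes are weighted per classifier."""
    score = sum(w if d == 1 else -w for d, w in zip(decisions, weights))
    return 1 if score > 0 else 0


# Three hypothetical base classifiers disagree on one sample.
equal_result = majority_vote([1, 0, 1])
weighted_result = weighted_vote([1, 0, 0], [0.9, 0.2, 0.3])
```

Stacking differs from both in that the base decisions become the input features of a further trained meta-classifier instead of being combined by a fixed rule.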
 
 
In [115], Le et al. comprehensively investigated various ensemble ML algorithms 
(using the load current as the input signal) built from different conventional ML 
methods, such as DT, RF, k-Nearest Neighbours (kNN), Gaussian Naïve Bayes (NB), 
and SVM. Based on extensive and rigorous analysis performed by the authors, an input 
vector is formed from five time-domain features extracted from the load current 
signals: the average, median, variance, RMS, and the difference between the maximum 
and minimum values. It is found that the stacking ensemble algorithm formed by kNN, 
RF, and Gaussian NB, with logistic regression (LR) as a meta-classifier, demonstrates 
the best performance. The same authors further investigated the effectiveness of 
semi-supervised ensemble ML (SVM or DT) when there are a significant number of 
unlabelled samples and limited labelled samples [116]. 
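The five time-domain features named above can be computed from one window of load current samples as in the following sketch. The function name and the toy window are illustrative, and the use of the population variance (rather than the sample variance) is an assumption, since the text does not say which one [115] uses.

```python
import math
import statistics


def time_domain_features(window):
    """Return [average, median, variance, RMS, max - min] for one window of samples."""
    average = statistics.fmean(window)
    median = statistics.median(window)
    variance = statistics.pvariance(window)  # population variance (assumption)
    rms = math.sqrt(sum(x * x for x in window) / len(window))
    peak_to_peak = max(window) - min(window)
    return [average, median, variance, rms, peak_to_peak]


# Toy window, not measured data.
features = time_domain_features([1.0, 2.0, 3.0, 4.0])
```

Each window then yields one five-element input vector for the base classifiers of the ensemble.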
The advantages of ensemble ML are: 
• it can usually improve the performance over any single ML model; 
• it is less likely to overfit and is more stable. 
The drawbacks of ensemble ML are: 
• it does not perform well on simple datasets; 
• it is usually computationally expensive, and therefore adds training time and 
memory constraints to the applications; 
• it reduces the model interpretability. 
A summary of ML based detection methods is presented in Table 2.7. 
 
 
 
Table 2.7 Summary of ML detection methods 
| Ref. | Methodology | Microcontroller or product | Sampling frequency | Test accuracy | Detection time |
|---|---|---|---|---|---|
| [91] | DWT + RF | Not implemented | Not reported | > 90% | 0.55 s |
| [103] | DWT + HMM | Not implemented | 20 kHz | 98.3% (simulation), 100% (experiment) | 57.1 ms |
| [104] | FFT + BPNN | Not implemented | 5 MHz | | |

The computed value represents the propagation of the radio-frequency signal generated 
by the arc, which will be different after arc occurrence. This application requires a very 
high sampling frequency above 1 MHz. 
In [118], Ahmadi et al. proposed a hybrid method that combines cross-correlation 
and signal-to-noise ratio to detect series arc faults in PV systems (the sampling 
frequency is 10 kHz). The measured DC terminal voltage 𝑉𝐷𝐶 is first normalised by the 
open circuit voltage of the system 𝑉𝑂𝐶 to get the normalised DC terminal voltage 
𝑉𝑎 = 𝑉𝐷𝐶/𝑉𝑂𝐶. After that, 𝑉𝑎 is filtered by a high-pass filter with a cutoff frequency of 
1 kHz to get the noise component 𝑉𝑏. Both signals are framed with a frame size of 𝐿. 
Then, the 𝑛th and (𝑛 − 10)th data frames of 𝑉𝑏 are selected to calculate the 
cross-correlation between $|V_b^{n}|$ and $|V_b^{n-10}|$ in order to find the index of the most 
similar parts of these two signals, 𝐿𝑎𝑔_𝑚𝑎𝑥. After that, $V_b^{n}$ and $V_b^{n-10}$ are resized 
using the following equation: 
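$$\begin{cases} V_b^{n} = V_b(n,\ |Lag\_max| : L) \\ V_b^{n-10} = V_b(n-10,\ 1 : L - |Lag\_max|) \end{cases} \quad \text{if } Lag\_max \ge 0$$
$$\begin{cases} V_b^{n} = V_b(n,\ 1 : L - |Lag\_max|) \\ V_b^{n-10} = V_b(n-10,\ |Lag\_max| : L) \end{cases} \quad \text{if } Lag\_max < 0$$

Table 2.9 Comparison of detection methods for DC arc fault detection 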
| Detection method | Domain | Frequency resolution | Time resolution | Sampling frequency | Computation effort | Popularity trend |
|---|---|---|---|---|---|---|
| FFT | Frequency | High | N/A | Medium/High | Medium | Stable |
| STFT | Both | Medium | Medium | Medium/High | Medium/High | Stable |
| DWT | Both | Medium | Medium | Medium/High | Low/Medium | Higher |
| WPD | Both | Medium | Medium | Medium/High | Medium | Stable |
| Statistics-based | Both | Depends | Depends | Depends | Low | Higher |
| Shallow ANN | Both | Depends | Depends | Depends | Medium/High | Lower |
| SVM | Both | Depends | Depends | Depends | Medium/High | Higher |
| HMM | Both | Depends | Depends | Depends | High | Higher |
| FL | Both | Depends | Depends | Depends | Low/Medium | Lower |
| Model-based | Both | Depends | Depends | Depends | Medium/High | Higher |
| EMR | Time | N/A | High | High | Low | Higher |
| SSTDR | Time | N/A | High | High | Medium | Stable |
| Cross correlation | Time | N/A | High | High | Medium | Stable |
| Kalman filter | Time | N/A | Depends | Low | Low | Higher |
 
2.5. Discussion and Conclusion 
Among the methods mentioned in the previous section, most fault signatures are 
extracted from the 1-100 kHz frequency band, where most disturbances, except 
inverter/converter noise, have the least impact on arc fault diagnosis. Therefore, 
eliminating the influence of power electronics noise can significantly increase the 
reliability and accuracy of the detection algorithms. 
Recently, many detection methods with good switching noise immunity have been 
introduced. For example, inverter noise is eliminated by calculating the Tsallis entropy 
twice [58] or by the cross-correlation function [118], switching noise is suppressed by 
performing a denoising algorithm [72], and normal transient events can be classified by 
using intelligent detection systems [103]. More noise suppression techniques should be 
developed and incorporated into detection algorithms to reduce the impact of different 
noise sources and avoid unwanted tripping [37]. Common fault signatures have yet to be 
discovered, so more thorough studies of DC arc fault characteristics and the 
development of good feature extraction techniques are needed. 
Threshold comparison is the main approach to fault diagnosis, and the threshold 
values are critical to most of the detection methods. Fixed threshold values are often the 
main limitation of most methods because the behaviour of arc faults may vary with 
different system structures and environments (i.e. different background noise levels). 
Recently, adaptive threshold values, determined statistically, have been used to 
mitigate this problem [39], [58], [74], [81]. Furthermore, most traditional algorithms 
rely on a single detection criterion, which can be easily influenced by various 
disturbances. Therefore, many recently proposed algorithms use several criteria 
extracted from hybrid domains or multiple signals (e.g. source voltage and load current) 
to improve arc fault diagnosis performance [39], [52], [74], [81], [90], [97], [98], [121]. 
With the advancement of computation technology and recent successes, ML-based 
techniques have become increasingly popular in many fields, including arc fault 
diagnosis and prognosis [91], [103]–[116]. Based on the comprehensive review 
presented above, DL has not yet been applied to this field, which remains a research gap 
to be filled. One of the prominent challenges for most of these techniques is the 
shortage of training data; this problem can potentially be mitigated by accurate system 
and arc fault modelling [99], [100], [103], or by data augmentation. However, the 
existing literature mainly focuses on simulation of the static V-I characteristics of arc 
faults. Although several high-frequency arc fault models have been proposed recently, 
they can only reproduce patterns similar to those of real arc faults in the frequency 
domain. Therefore, more precise high-frequency arc fault models or more advanced 
data augmentation techniques need to be developed. There are other challenges in 
applying ML to practical applications, and they will be discussed in detail in Section 4.6. 
Besides the detection algorithms using electrical signals, high-frequency EMR 
signals are also considered for DC arc fault detection in PV systems [124], [127], [128]. 
As the detection range is usually limited, this type of method might be a good candidate 
for small household PV systems. 
In conclusion, this chapter has comprehensively reviewed DC arc fault modelling 
methods and the detection methods developed recently that can be used for PV systems. 
The advantages and disadvantages of different detection methods have been discussed 
and compared in detail. Better arc fault detection methods for PV systems with good 
self-adjustability, robustness, and cost-effectiveness have yet to be realised. More 
effort is still needed to improve the detection accuracy and reliability. 
 
Some of the work described in this chapter has been published in: 
Shibo Lu, B. T. Phung and Daming Zhang, “A Comprehensive Review on DC Arc 
Faults and Their Diagnosis Methods in Photovoltaic Systems,” Renewable and 
Sustainable Energy Reviews, vol. 89, pp.88-98, June 2018. 
 
 
3. Characteristics Study on DC Series Arc Fault 
3.1. Introduction 
The voltage-current characteristics of arcing faults have been studied over the past 
few decades, and many empirical equations have been proposed to describe the static 
V-I characteristics [43]. However, the behaviour of the high-frequency components of 
the arc current and their dependency on many other factors are often ignored. Yao et al. 
proposed a DWT-based algorithm that achieves 100% accuracy at lower current and 
voltage levels, while the accuracy decreases significantly at higher current and voltage 
levels [52]. In many applications, the sum of the power spectrum components within the 
40-100 kHz frequency band, which increases after arc fault occurrence, has been used 
as an indicator [61], [68], [69]. However, the algorithm may fail at higher current 
levels: it has been shown that, under fixed source voltage and gap distance, the 
probability distribution of the arcing signal (power density spectrum signal strength vs. 
probability) can extend into the area of the normal signal at higher load current levels 
[129]. This can affect the accuracy and reliability of the detection algorithm. 
Furthermore, Zhen et al. found that a fault indicator with satisfactory performance for 
28 V DC systems may fail to work for 270 V DC systems under the same load current 
[33]. Therefore, high-frequency noise characteristics and their determinants play an 
important role in the development of arc fault detection algorithms. All these factors call 
for a more in-depth understanding of the DC arc fault phenomenon to achieve better 
detection. As arc faults behave differently under different working conditions, new 
methods are also needed to extract consistent features. 
The high-frequency variation is related to cathode spot activities [130]. The reason 
is that cathode spot motion can cause localised breakdown in the plasma sheath, which 
can generate broadband radio-frequency signals. Focusing on the radio-frequency 
current with frequencies up to tens of MHz requires high-performance sensors and 
measurement equipment to precisely capture the high-frequency components. It also 
requires a more powerful and costly microcontroller to process the large amount of data 
produced at the higher sampling rate. Furthermore, the behaviour of the radio-frequency 
current, especially above 100 kHz, can be affected by various factors in PV systems 
[131]. Therefore, this thesis focuses on the frequency content below 100 kHz. 
3.2. Experimental Setup 
The power source used is a LAB/HP 15600, which can supply DC voltage and current 
up to 500 V and 30 A, respectively. An adjustable resistive load bank of up to 85.5 Ω is 
used. Since this chapter mainly focuses on exploring arc features, a power electronics 
load is not considered. Additional inductive and capacitive components are not included 
in the experimental circuit in this section. 
The DC information of the loop current and arc voltage is captured by PROSyS-
CP35 differential current probe and SI-9000 differential voltage probe, respectively. In 
addition, the high frequency variation of the loop current is captured by a Pearson-4688 
current transformer with 600 Hz cutoff frequency (-3dB attenuation). The data 
acquisition (DAQ) system consists of a National Instruments PXIe-1073 and a PXIe-
4300. The PXIe-4300 can stream the raw data to a personal computer at sampling 
frequency of 200 kHz with 16-bit analog-to-digital converters. The experimental setup 
for this chapter is shown in Figure 3.1. 
 
Figure 3.1 Experimental setup for characteristics study of DC series arc fault 
The DC arc fault generation method was introduced in the PV DC arc fault circuit 
protection outline, UL-1699B, in 2011 [10], [14]. It uses the “steel wool” method to 
initiate an arc fault: the moving electrode is first inserted into a polymer sheath tube 
containing some steel wool and then adjusted to a given distance from a stationary 
electrode. When voltage is applied across the electrodes, the steel wool burns and melts 
quickly, which ignites an arc in the electrode gap. However, arc generation with the 
“steel wool” method takes more time and brings more complexity with less repeatability. 
More importantly, it is difficult to produce low-power arc faults such as 100 W arc 
faults [34]. Furthermore, as the arc length and arc gap of a parallel arc in the real world 
are much longer than those of a series arc, the “steel wool” method is a better way to 
initiate and sustain a parallel arc. The higher power and longer air gap of a parallel arc 
easily destroy the steel wool, which vaporises quickly to establish the arc. The 
“pull-apart” method is better for creating a series arc, whereas the “steel wool” method 
with a tube makes it easier to develop a parallel arc. 
The first edition of the formal UL-1699B Standard became available in August 
2018 [15]. Steel wool and the polymer sheath tube were removed from the guideline for 
arc fault detection tests. Therefore, as recommended in [14], [15], an arc generator 
without steel wool and polymer sheath tube is used in this chapter to reduce the 
complexity and increase the repeatability of the arc generation experiment. The diagram 
of the arc generator is illustrated in Figure 3.2. Copper is used as the electrode material 
in the test as it is commonly used for connections in DC systems. Copper electrodes of 
6.35 mm (1/4 inch) diameter with a flat tip are used. The experimental conditions are 
listed in Table 3.1. 
 
Figure 3.2 Diagram of arc fault generator in UL-1699B Standard 
The same data collection procedures are followed for all experimental conditions: 
 
1. Wait until electrodes and load resistor cool down to the ambient temperature; 
2. Polish electrodes with sandpaper, mount electrodes onto the arc generator, and 
connect them together; 
3. Adjust the load resistance; 
4. Start the DC power source and adjust the voltage and current to the desired level; 
5. Start the data acquisition (DAQ) system in the personal computer; 
6. Initiate an arc by separating the copper electrodes to the pre-defined gap distance; 
7. Shut down the DC power source to extinguish the arc and stop the measurement. 
All the datasets are saved locally. Then, the on-site data sets are visualised, 
analysed, and processed by Matlab. 
Table 3.1 Experimental conditions for characteristics study of DC series arc fault 
| Case | Condition (air gap width = 1, 2, 3 mm) |
|---|---|
| Fixed DC voltage (200 V) | 4.1, 7.9, 11, 13.6 A |
| Fixed DC current (6.5 A) | 87, 111, 134, 158 V |
 
3.3. Static Characteristics 
3.3.1. V-I Characteristics 
Arcing is a very complex physical phenomenon. Nowadays, most arc studies are 
based on observation of experiments and analysis of acquired data, and scholars mainly 
use the V-I curve to characterise this phenomenon [43], [131], [132]. The arc can be 
then treated as a non-linear resistance. The experimental data is fitted to the Nottingham 
arc model, which is suitable for gap distance from 1-10 mm, as shown in (3.1): 
$$V_{arc} = A + \frac{B}{I_{arc}^{\,n}} \tag{3.1}$$
where A, B, and n are arc constants depending on the gap distance. The result is plotted 
in Figure 3.3, and it is found consistent with the results in [43], [132]. 
 
Figure 3.3 V-I characteristic under different gap distance 
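Fitting measured points to (3.1) can be sketched as follows. The fitting procedure used in this thesis is not described, so the approach here is an assumption: if the exponent n is treated as given, (3.1) is linear in A and B, and both follow from ordinary least squares on the regressor 1/I^n. The constants and the synthetic voltages are hypothetical and only the current levels are taken from Table 3.1.

```python
def nottingham_voltage(i_arc, a, b, n):
    """Static arc voltage from (3.1): V_arc = A + B / I_arc**n."""
    return a + b / i_arc ** n


def fit_a_b(currents, voltages, n):
    """Least-squares estimate of A and B for a fixed, assumed exponent n."""
    z = [1.0 / i ** n for i in currents]          # regressor 1 / I**n
    count = len(z)
    z_mean = sum(z) / count
    v_mean = sum(voltages) / count
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_zv = sum((zi - z_mean) * (vi - v_mean) for zi, vi in zip(z, voltages))
    b = s_zv / s_zz
    a = v_mean - b * z_mean
    return a, b


# Synthetic check: voltages generated from hypothetical constants, then recovered.
currents = [4.1, 7.9, 11.0, 13.6]
voltages = [nottingham_voltage(i, 20.0, 50.0, 0.8) for i in currents]
a_fit, b_fit = fit_a_b(currents, voltages, 0.8)
```

With measured data, the exponent would also have to be estimated, for example by repeating this fit over a range of candidate n values and keeping the one with the smallest residual.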
It should be noted that the arc gap distance is not equal to the actual arc length, and 
the additional impedance injected into the circuit is determined by the arc length; 
however, the two can be considered the same when the air gap distance is small, 
which is the case in this research. When the gap distance is fixed, in the lower arc 
current region, the smaller the arc current, the larger the arc voltage, and the arc 
power (𝑃𝑎𝑟𝑐 = 𝑉𝑎𝑟𝑐𝐼𝑎𝑟𝑐) tends to remain the same; whilst in the higher arc current region, 
the arc voltage remains approximately unchanged as the arc current increases. 
When the arc current is fixed, the larger the gap distance, the larger the arc voltage, 
and thus the larger the arc resistance. Assuming that the electrical conductivity and the 
effective cross-section area of the arc column remain the same in the quasi-stationary 
state, the arc resistance is proportional to the distance according to (3.2): 
 
$$R_{arc} = \frac{1}{\sigma}\,\frac{l}{S} \tag{3.2}$$
where σ is electrical conductivity of the arc column, l is arc length (it equals gap 
distance for short arc), and S is the effective cross-section area of the arc column. 
3.3.2. Stable Operating Point 
In a typical DC circuit, the following equation can be obtained: 
$$V_{source} = L_{load}\frac{dI}{dt} + R_{load}I + V_{arc} \tag{3.3}$$
where 𝑉𝑠𝑜𝑢𝑟𝑐𝑒 is the DC source voltage, 𝑅𝑙𝑜𝑎𝑑 and 𝐿𝑙𝑜𝑎𝑑 are circuit parameters, and 𝑉𝑎𝑟𝑐 is 
the arc voltage, which equals 0 before the arc fault instant. Equation (3.3) can be 
rewritten as: 
$$\frac{dI}{dt} = \frac{1}{L_{load}}\left[(V_{source} - R_{load}I) - V_{arc}\right] \tag{3.4}$$
When an arc fault occurs, dI/dt is negative, and the current starts to decrease. 
When the gap distance reaches its final value, stable arcing is established where 
𝑉𝑠𝑜𝑢𝑟𝑐𝑒 − 𝑅𝑙𝑜𝑎𝑑𝐼 = 𝑉𝑎𝑟𝑐 and dI/dt = 0, as shown in Figure 3.4. The stable operating point 
is point A instead of point B. For example, consider the case when a small disturbance 
at point A causes the current to decrease. Because 𝑉𝑠𝑜𝑢𝑟𝑐𝑒 − 𝑅𝑙𝑜𝑎𝑑𝐼 > 𝑉𝑎𝑟𝑐 there, 
dI/dt > 0 and the input power is greater than the power dissipated by the arc. 
Therefore, the arc tends to burn more steadily, and the loop current increases back to 
point A. Similarly, when a small disturbance causes an increase in the current at 
point A, dI/dt < 0 and the loop current decreases back to point A. 

Figure 3.4 The condition for a stable arcing point 
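The stability argument based on (3.4) can be checked numerically. The sketch below assumes the static arc characteristic of (3.1), finds the currents at which the load line meets the arc characteristic, and labels each intersection stable when the bracketed term of (3.4) decreases with current there, so that a small deviation is pulled back. All circuit and arc constants are hypothetical, and the grid scan is a simple way to bracket the roots, not a method taken from this thesis.

```python
def net_voltage(i, v_source, r_load, a, b, n):
    """(V_source - R_load * I) - V_arc(I), which is proportional to dI/dt in (3.4)."""
    return (v_source - r_load * i) - (a + b / i ** n)


def bisect(f, lo, hi, iterations=100):
    """Refine a root of f inside [lo, hi], where f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def operating_points(v_source, r_load, a, b, n, steps=10000):
    """Return (current, is_stable) for every load-line / arc-curve intersection found."""
    def f(i):
        return net_voltage(i, v_source, r_load, a, b, n)

    i_max = v_source / r_load                      # current with no arc in the loop
    grid = [i_max * s / steps for s in range(1, steps + 1)]
    points = []
    for lo, hi in zip(grid, grid[1:]):
        if f(lo) * f(hi) < 0.0:                    # sign change brackets a root
            i_eq = bisect(f, lo, hi)
            # Slope of the net voltage with respect to current at the equilibrium.
            slope = -r_load + n * b / i_eq ** (n + 1)
            points.append((i_eq, slope < 0.0))
    return points


# Hypothetical values: 200 V source, 20 ohm load, A = 20, B = 50, n = 1.
points = operating_points(200.0, 20.0, 20.0, 50.0, 1.0)
```

For these values the scan finds two intersections: a low-current one that is unstable and a high-current one that is stable, matching the roles of points B and A in Figure 3.4.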
The typical waveforms of a free-burning DC series arc fault in DC networks are 
shown in Figure 3.5. The arc voltage remains approximately the same and then keeps 
decreasing after a few seconds. This is because the electrical conductivity of the electric 
arc increases with temperature once the temperature is high enough to cause thermal 
ionisation. Specifically, due to the increasing temperature, the degree of dissociation 
and thermal ionisation of the inter-electrode medium and metal vapor keeps increasing, 
leading to an increasing number of charge carriers and higher electrical conductivity. 
This effectively decreases the arc resistance and the arc voltage [133]. This suggests that 
under the same load condition (same source voltage and load current) and gap distance, 
the quasi-stationary V-I characteristic of DC arc may vary with temperature as the arc 
keeps burning continuously. Therefore, the stable operating point of the arc would 
change when the temperature is high enough to cause thermal ionisation. 
 
Figure 3.5 Waveforms of a typical DC series arc fault: (a) arc current; (b) arc voltage 
3.3.3. Load Current Effect and Source Voltage Effect 
Because an arc is a very complex physical phenomenon that cannot be treated exactly 
as a constant resistor, it is difficult to analyse its behaviour point by point. 
Therefore, the average arc resistance is calculated by averaging the arc resistance of 
each sampled data point in the first 2 seconds after the moving electrode has reached 
the predefined distance. In Figure 3.6 (a), the average arc resistance and the amount of 
change in average resistance decrease as the load current level increases when the 
source voltage is fixed. On the other hand, it is fair to say that the DC voltage level has 
less impact on the arc resistance than the load current level, as shown in Figure 3.6 (b). 
The average arc resistance decreases slightly with increasing source voltage level, as 
expected. Note that this condition is only satisfied when sustainable arcing is 
generated. Also, arc resistance increases with increasing gap distance, which agrees 
with (3.2). 
Figure 3.6 Average arc resistance under (a) fixed source voltage; (b) fixed load current 
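The averaging described above can be sketched as follows: the per-sample resistance v/i is averaged over a fixed duration starting at the sample where the moving electrode reaches its final position. The function name, the way the start index is supplied, and the skipping of zero-current samples are assumptions made for illustration.

```python
def average_arc_resistance(arc_voltage, arc_current, fs_hz, start_index, duration_s=2.0):
    """Mean of v[k] / i[k] over `duration_s` seconds beginning at `start_index`."""
    count = int(round(duration_s * fs_hz))
    v = arc_voltage[start_index:start_index + count]
    i = arc_current[start_index:start_index + count]
    # Assumption: samples with zero current are skipped to avoid division by zero.
    ratios = [vk / ik for vk, ik in zip(v, i) if ik != 0.0]
    return sum(ratios) / len(ratios)


# Toy record sampled at 2 Hz (not measured data); the electrode stops at index 2.
toy_voltage = [0.0, 0.0, 20.0, 22.0, 24.0, 26.0]
toy_current = [5.0, 5.0, 2.0, 2.0, 2.0, 2.0]
r_avg = average_arc_resistance(toy_voltage, toy_current, fs_hz=2.0, start_index=2)
```

At the 200 kHz sampling frequency used in this chapter, a 2-second window corresponds to 400,000 samples per record.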
3.4. High-frequency Variation in Arc Current 
In this part, Fourier transform and wavelet packet entropy are applied to analyse arc 
fault current signals under different conditions. 
3.4.1. Wavelet Packet Entropy 
WPD is a multi-resolution analysis that can divide a signal into different frequency 
bands as illustrated in Figure 2.8. Mathematically, discrete WPD can be achieved 
through a series of convolutions with a pair of high-pass and low-pass filters, 𝑔(𝑛) and 
ℎ(𝑛) [134]. These filters are given by: 
$$g(n) = \frac{1}{\sqrt{2}}\langle \psi(t), \varphi(2t - n) \rangle \tag{3.5}$$
$$h(n) = \frac{1}{\sqrt{2}}\langle \varphi(t), \varphi(2t - n) \rangle \tag{3.6}$$
$$g(n) = (-1)^{n} h(1 - n) \tag{3.7}$$
where 〈∙,∙〉 is the inner product operator, 𝑡 and 𝑛 are variables, and 𝜑(∙) and 𝜓(∙) 
denote the scale function and its corresponding wavelet function, respectively. 
For a discrete time-series signal, the wavelet coefficients for different frequency 
bands at different decomposition levels can be calculated iteratively using the following 
functions: 
$$W_{2m-1}^{j+1}(k) = \sum_{n} h(n - 2k)\,W_{m}^{j}(n) \tag{3.8}$$
$$W_{2m}^{j+1}(k) = \sum_{n} g(n - 2k)\,W_{m}^{j}(n) \tag{3.9}$$
where $W_{1}^{0}$ denotes the original discrete time-series signal of length $L$, $W_{m}^{j}(n)$ 
represents the wavelet coefficients at the $j$th level and $m$th frequency band, 
$n = 1, 2, \ldots, L/2^{j}$, $W_{2m-1}^{j+1}(k)$ is the wavelet coefficients at the $(j+1)$th level and 
$(2m-1)$th frequency band, $W_{2m}^{j+1}(k)$ is the wavelet coefficients at the $(j+1)$th level 
and $(2m)$th frequency band, and $k = 1, 2, \ldots, L/2^{j+1}$. It should also be noted that 
$m = 1, 2, \ldots, 2^{j}$ at the $j$th level. 
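The recursion in (3.8)-(3.9) can be sketched in a few lines. For brevity the sketch swaps in the two-tap Haar filter pair instead of the db9 wavelet used in this chapter, uses 0-based indices and periodic extension at the window edges, and builds the high-pass filter from the low-pass one with (3.7). These choices and the function names are assumptions for illustration only.

```python
import math

H = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]     # Haar low-pass taps h(0), h(1)
G = [(-1) ** n * H[1 - n] for n in range(2)]           # high-pass taps from (3.7)


def split(w, taps):
    """out(k) = sum_n taps(n - 2k) * w(n): filter, then keep every second sample."""
    length = len(w)
    return [sum(taps[t] * w[(2 * k + t) % length] for t in range(len(taps)))
            for k in range(length // 2)]


def wavelet_packet_decompose(signal, levels):
    """Return the 2**levels coefficient sets after `levels` applications of (3.8)-(3.9)."""
    bands = [list(signal)]
    for _ in range(levels):
        next_bands = []
        for w in bands:
            next_bands.append(split(w, H))   # band 2m - 1 of the next level
            next_bands.append(split(w, G))   # band 2m of the next level
        bands = next_bands
    return bands


# A constant toy signal puts all of its energy into the first band.
bands = wavelet_packet_decompose([1.0] * 8, levels=3)
```

Note that the band order produced by this plain recursion follows the filter order and does not in general coincide with increasing frequency, so a reordering step is needed before bands are labelled by frequency range.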
Entropy can be used to measure the degree of disorder. The wavelet packet entropy 
is obtained by combining WPD and entropy theory. Specifically, after WPD analysis, 
the set of wavelet coefficients at the $j$th level and $m$th frequency band, $W_{m}^{j}$, with 
values $\{w_{m1}^{j}, w_{m2}^{j}, \cdots, w_{mn}^{j}, \cdots, w_{mN}^{j}\}$ can be obtained, where $m = 1, 2, \cdots, 2^{j}$ 
and $N$ denotes the number of coefficients. Then, the entropy $H_{m}^{j}$ can be calculated as 
follows: 
$$H_{m}^{j} = -\sum_{n=1}^{N} p(w_{mn}^{j}) \log p(w_{mn}^{j}) \tag{3.10}$$
$$p(w_{mn}^{j}) = \frac{|w_{mn}^{j}|^{2}}{\sum_{i=1}^{N} |w_{mi}^{j}|^{2}} \tag{3.11}$$
where $p(w_{mn}^{j})$ is the probability of $w_{mn}^{j}$, and $\sum_{n=1}^{N} p(w_{mn}^{j}) = 1$. 
In the case study of this chapter, the sampling frequency is 200 kHz, the WPD level 
is 𝑗 = 3, and the analysed window length is 𝑇 = 40 ms. Accordingly, there are 8 sub-
bands with 𝑁 = 1000 wavelet coefficients for each band. Furthermore, db9 is chosen as 
the mother wavelet because of its excellent performance in arc fault signal analysis in 
DC systems with resistive loads [26]. 
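The entropy of (3.10)-(3.11) and the window arithmetic above can be sketched as follows. The base of the logarithm is not stated in the text; base 2 is assumed here because the entropy levels reported in Table 3.2 exceed ln(1000) ≈ 6.91, which a natural logarithm could not produce with 𝑁 = 1000 coefficients, while log2(1000) ≈ 9.97 can. The function names and the zero-energy convention are also assumptions.

```python
import math


def wavelet_packet_entropy(coefficients):
    """Entropy of one sub-band from (3.10)-(3.11), using a base-2 logarithm (assumed)."""
    energy = sum(c * c for c in coefficients)
    if energy == 0.0:
        return 0.0  # convention chosen here for an all-zero band (assumption)
    entropy = 0.0
    for c in coefficients:
        p = c * c / energy
        if p > 0.0:               # terms with p = 0 contribute nothing
            entropy -= p * math.log2(p)
    return entropy


def window_layout(fs_hz, window_s, level):
    """Samples per window, number of sub-bands, coefficients per band, band width in Hz."""
    samples = round(fs_hz * window_s)
    bands = 2 ** level
    return samples, bands, samples // bands, (fs_hz / 2.0) / bands


layout = window_layout(200_000, 0.040, 3)          # values used in this chapter
flat = wavelet_packet_entropy([1.0] * 8)           # evenly spread energy
spike = wavelet_packet_entropy([0.0, 5.0, 0.0])    # energy in a single coefficient
```

A band whose energy is spread evenly over its coefficients reaches the maximum entropy, while a band dominated by one large coefficient has an entropy near zero, which is consistent with the sharp entropy drop reported at the instant of arc ignition.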
3.4.2. Effect of Arc Phase 
Based on experimental observations, after initiation of an arc fault, the arc behaviour 
can be characterised by three different arc phases under the experimental conditions. 
The waveforms of a DC series arc fault generated at 11A/200V with a gap distance of 
1 mm are used as an example, as shown in Figure 3.7: 
• Arc phase 1: Drastic variation can be observed in the arc current signal. 
• Arc phase 2: Less variation can be observed in the arc current signal. Also, some 
spikes can be observed. 
• Arc phase 3: The arc tends to be steady. Less variation and fewer spikes are 
observed in the arc current signal. 
Figure 3.7 High frequency variation of DC series arc fault at 11A/200V 
This behaviour is inferred from experimental observations, and a similar 
phenomenon was observed in [77]. A more comprehensive analysis of its physics is not 
carried out since it is outside the scope of this thesis. 
Because of natural convection caused by air heated by ionisation, and the upward 
convective flow caused by the lower density of hot air, the convection force gives the 
arc a bow shape [43]. In addition, the arc root tends to be stable at the edge of the 
electrodes. The frequency spectrum is shown in Figure 3.8. The arc noise intensity in 
the current spectrum increases significantly after arc occurrence. Furthermore, as 
described above, the spectrum level decreases as the arc moves from one phase to the 
next over time. The arc current spectrum also exhibits pink noise characteristics. 
 
 
Figure 3.8 Average frequency spectrum of the 2-second data in the non-arc state and in the
arc state at different arc phases (FFT analysis window is 0.2 seconds)
 
The wavelet packet entropy calculation is shown in Figure 3.9. Right after the
instant of arc initiation, the entropy of each band drops sharply towards zero.
This is mainly caused by the large spike introduced in the arc current by the arc
ignition. After the initiation, the entropy level of bands 2-4 and 6-8 increases
with little fluctuation, while that of band 1 and band 5 remains approximately
the same but with larger fluctuations. The entropy levels of each band in the non-arc
state and in the different arc phases are shown in Table 3.2. Unlike the frequency
spectrum, which changes significantly with the arc phase, the entropy of bands 2-4 and
6-8 remains approximately unchanged across the arc phases.
 
 
Figure 3.9 Wavelet-packet entropy of DC series arc fault at 11A/200V 
Table 3.2 Wavelet-packet entropy level for different arc phases (11A/200V)
Band (frequency range)   Non-arc  Phase 1  Phase 2  Phase 3
Band8 (87.5-100 kHz)     5.805    8.814    8.780    8.749
Band7 (75-87.5 kHz)      6.329    8.824    8.843    8.840
Band6 (62.5-75 kHz)      5.896    8.620    8.644    8.622
Band5 (50-62.5 kHz)      7.476    7.097    8.203    8.359
Band4 (37.5-50 kHz)      6.069    8.797    8.807    8.834
Band3 (25-37.5 kHz)      6.120    8.840    8.856    8.851
Band2 (12.5-25 kHz)      6.534    8.772    8.817    8.834
Band1 (0-12.5 kHz)       8.961    8.283    8.700    8.743

I also extend my sincere gratitude to my secondary supervisor, Dr. Daming Zhang,
for providing valuable feedback and timely discussions to my research work.
In addition, I would like to acknowledge Mr. Zhenyu Liu, technical support staff,
for his detailed instructions and fruitful discussions of the experiments throughout my
research study. Besides, I would like to thank all my friends and colleagues in my life,
in particular Tharmakulasingam Sirojan, Hua Chai, Miao Li, Dr. Muhammad Tariq
Nazir all from the UNSW Energy Systems research group, and Rui Ma from University
of Miami, USA. Thank you all for the friendship, help, and support.
I would also like to thank the Tyree Foundation and UNSW for their financial
support.
I am so grateful for the unconditional love and unwavering support from my family
members throughout my life, in particular my parents, Xuejun Lu and Hui Yin, without
whom I would never have enjoyed so many opportunities. Last but certainly not the
least, I would like to specially thank my wife, Zhi Chen. Zhi, I could not be able to
finish this work without your love and support. Thank you for stepping into my life and
being with me.

Abstract
Grid integration of renewable sources including solar energy is growing faster than
ever before. Nowadays, solar power development is increasing throughout the world,
and solar photovoltaic (PV) systems play an important role to support the main loads
and micro-grids. However, one needs to consider the long-term performance of PV
components. Their deterioration can be caused by various factors such as ageing,
weathering, the higher DC operating voltage level, improper installation, inadequate
maintenance, etc. The consequence is a growing potential of electrical arcing incidents
especially the series arc fault in PV systems. Without timely detection and interruption,
such dangerous events can cause catastrophic fires, posing a severe threat to human
safety and properties.
In this thesis, a comprehensive review of DC arc fault and their diagnosis methods
in PV systems is presented. Experimental study of DC series arc fault characteristics is
carried out. The feasibility of applying deep learning (DL) in series arc fault detection in
PV systems is systematically investigated. Specifically, convolutional neural networks
(CNN) are successfully applied and demonstrate superior diagnosis performance over
conventional machine learning algorithms and other popular DL algorithms. For
cost-effective real-time deployment, a lightweight CNN structure is designed to achieve a
good balance between model complexity and detection accuracy. Moreover, novel
frameworks, including domain adaptation and deep convolutional generative adversarial
network (DA-DCGAN) and lightweight transfer convolutional neural network with
adversarial data augmentation (LTCNN-ADA), are proposed. They aim to address the
challenges when applying DL to practical applications, including lack of fault data from
the field, data inconsistency between laboratory and field, and limited computation
resources in edge devices. The proposed methods are validated through comprehensive
offline analysis using pre-recorded data. In addition, the trained DL classification
models are deployed in an embedded system and tested in single-phase and three-phase
PV systems in real-time under different test conditions. Both offline and online
experimental results show that the proposed methods can accurately and reliably detect
series arc fault in PV systems.

INCLUSION OF PUBLICATIONS STATEMENT
UNSW is supportive of candidates publishing their research results during their candidature as
detailed in the UNSW Thesis Examination Procedure.

Publications can be used in their thesis in lieu of a Chapter if:
• The student contributed greater than 50% of the content in the publication and is the
“primary author”, i.e. the student was responsible primarily for the planning, execution and
preparation of the work for publication
• The student has approval to include the publication in their thesis in lieu of a Chapter from
their supervisor and Postgraduate Coordinator.
• The publication is not subject to any obligations or contractual agreements with a third party
that would constrain its inclusion in the thesis

Please indicate whether this thesis contains published material or not.
☐ This thesis contains no publications, either published or submitted for publication
☒ Some of the work described in this thesis has been published and it has been documented
in the relevant Chapters with acknowledgement
☐ This thesis has publications (either published or submitted for publication) incorporated
into it in lieu of a chapter and the details are presented below

CANDIDATE’S DECLARATION
I declare that:
• I have complied with the Thesis Examination Procedure
• where I have used a publication in lieu of a Chapter, the listed publication(s) below
meet(s) the requirements to be included in the thesis.
Name: Shibo LU
Signature:
Date (dd/mm/yy):

Table of contents

ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF ACRONYMS
LIST OF FIGURES
LIST OF TABLES
1. INTRODUCTION
1.1. BACKGROUND AND RESEARCH MOTIVATION
1.2. SUMMARY OF RESEARCH CONTRIBUTIONS
1.3. THESIS ORGANISATION
1.4. LIST OF PUBLICATIONS
2. LITERATURE REVIEW
2.1. INTRODUCTION
2.2. DC ARC IN PHOTOVOLTAIC SYSTEMS
2.2.1. Photovoltaic Systems Structure and Arc Hazards
2.2.2. Challenges to Detect DC Arc Faults
2.3. DC ARC MODELS
2.3.1. Physics-based Arc Model
2.3.2. V-I Characteristic-based Arc Model
2.3.2.1. Nottingham Arc Model
2.3.2.2. Hall, Myer, and Viicheck Arc Model
2.3.2.3. Stokes and Oppenlander Arc Model
2.3.2.4. Paukert Arc Model
2.3.2.5. Modified Paukert Arc Model
2.3.3. Heuristic
 
From the experiment, one can see that the higher the current, the faster the
temperature rises in the arc column, and the sooner arc phase 3 is reached, as shown
in Figure 3.10. This also reveals the temperature dependency of the arc phases. At higher
temperature, the degree of thermal ionisation of the metal vapor and mixed gases is
higher. For example, copper vapor starts to ionise above 4000 K [133]. The additional
electrons sustain the arc burning process.
[Figure 3.10 panel annotations: arc phase 1 and arc phase 2 in the first panel; arc phases 1, 2, and 3 in the second and third panels]
Figure 3.10 DC current dependency for different arc phases 
3.4.3. Load Current Effect 
Based on the experiments, the noise level of a low-current arc fault depends on
circuit parameters such as source voltage and load current. As shown in
Figure 3.11, under fixed voltage and fixed gap distance, there is less high-frequency
variation at a higher load current level. In addition, for the same V-I characteristic curve,
when the load current level is higher, the stable operating point is much further away from
the interrupted point (where the circuit characteristic curve is tangent to the V-I
characteristic curve), as shown in Figure 3.12. This gives the stable operating point more
margin, which increases the stability of the arc and thus reduces the variation in the arc
current signal. The wavelet packet entropy calculation for different load currents under
fixed source voltage is shown in Figure 3.13, Table 3.3, and Table 3.4. C1-C8 represent
different series arc fault cases generated at different load current and source voltage
levels (arc gap distance is 1 mm): C1: 4.1A/200V; C2: 7.9A/200V; C3: 11A/200V; C4:
13.7A/200V; C5: 6.5A/87V; C6: 6.5A/111V; C7: 6.5A/134V; C8: 6.5A/158V. The
results are similar to those discussed in the previous section. More importantly, the
arc-state entropy level of bands 2-4 and 6-8 remains approximately unchanged when the
load current changes.
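The margin argument can be illustrated numerically. The sketch below is only a toy model under stated assumptions: the arc is given a hypothetical hyperbolic characteristic V_arc = A + B/I (the constants `A_V` and `B_W` are made-up values, not fitted to the experiments), it is placed in series with a source V_s and a load resistance R, and the larger root of the resulting quadratic is taken as the stable operating point. The interrupted point is where the load line becomes tangent to the arc curve.

```python
import math

A_V = 20.0   # hypothetical constant term of the arc voltage, in volts
B_W = 50.0   # hypothetical hyperbolic term of the arc voltage, in watts

def stable_current(v_source, r_load):
    """Larger root of R*I^2 - (Vs - A)*I + B = 0, or None if no intersection exists."""
    dv = v_source - A_V
    disc = dv * dv - 4.0 * r_load * B_W
    if dv <= 0.0 or disc < 0.0:
        return None            # load line misses the arc curve: the arc cannot burn
    return (dv + math.sqrt(disc)) / (2.0 * r_load)

def interrupted_current(v_source):
    """Current at the tangent point of the load line and the arc curve."""
    return 2.0 * B_W / (v_source - A_V)

v_s = 200.0
margins = []
for r in (40.0, 20.0, 14.0):   # a smaller load resistance means a higher load current
    margins.append(stable_current(v_s, r) - interrupted_current(v_s))
```

In this toy model the distance between the stable operating point and the interrupted point grows as the load current increases, in line with the trend described for Figure 3.12.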
 
Figure 3.11 DC load current dependent arc spectrogram under fixed source voltage 
 
[Figure 3.12 labels: voltage versus current; fixed gap distance; interrupted point; stable operating point]
Figure 3.12 V-I curve for fixed source voltage 
Table 3.3 Wavelet-packet entropy level (non-arc state) for different load current and
source voltage
Band (frequency range)   C1     C2     C3     C4     C5      C6     C7     C8
Band8 (87.5-100 kHz)     6.140  5.813  5.774  5.838  6.208   6.171  6.767  6.347
Band7 (75-87.5 kHz)      7.108  6.386  6.499  6.325  6.883   6.908  7.320  6.742
Band6 (62.5-75 kHz)      6.189  5.780  5.807  5.657  6.522   6.476  7.030  6.576
Band5 (50-62.5 kHz)      9.369  7.926  7.453  7.502  6.8985  7.449  7.845  8.734
Band4 (37.5-50 kHz)      6.095  6.111  6.223  5.988  6.562   6.373  7.061  6.949
Band3 (25-37.5 kHz)      6.661  6.108  6.313  6.174  6.456   6.305  7.048  6.638
Band2 (12.5-25 kHz)      6.739  6.607  6.997  6.612  7.071   6.951  7.683  6.776
Band1 (0-12.5 kHz)       9.130  9.164  8.855  8.921  9.525   9.280  9.168  9.379
Table 3.4 Wavelet-packet entropy level (arc state) for different load current and source
voltage
Band (frequency range)   C1     C2     C3     C4     C5     C6     C7     C8
Band8 (87.5-100 kHz)     8.692  8.695  8.745  8.845  8.718  8.539  8.539  8.541
Band7 (75-87.5 kHz)      8.678  8.745  8.789  8.835  8.709  8.543  8.626  8.637
Band6 (62.5-75 kHz)      8.712  8.606  8.493  8.791  8.703  8.427  8.418  8.305
Band5 (50-62.5 kHz)      8.706  7.501  7.316  8.241  8.388  7.454  8.095  7.634
Band4 (37.5-50 kHz)      8.643  8.650  8.773  8.743  8.692  8.604  8.671  8.603
Band3 (25-37.5 kHz)      8.677  8.707  8.708  8.822  8.707  8.549  8.689  8.674
Band2 (12.5-25 kHz)      8.861  8.789  8.787  8.685  8.710  8.650  8.730  8.640
Band1 (0-12.5 kHz)       8.562  8.725  8.682  8.238  8.669  8.679  8.403  8.489
 
 
 
Figure 3.13 Wavelet-packet entropy under fixed DC source voltage 
 
3.4.4. Source Voltage Effect 
Under fixed load current and gap distance, there is less high-frequency variation at a
higher source voltage level, as shown in Figures 3.14 and 3.15. Similarly, the higher source
voltage gives the arc more margin to the interrupted point. Therefore, the arc is more
stable and generates less variation at a higher source voltage level. On the other hand,
as shown in Figure 3.16, Table 3.3, and Table 3.4, the arc-state entropy level of bands
2-4 and 6-8 remains approximately the same as the DC source voltage changes.
 
 
Figure 3.14 DC source voltage dependent arc spectrogram under fixed load current 
[Figure 3.15 labels: voltage versus current; fixed gap distance; interrupted point; stable operating point]
Figure 3.15 V-I curve for fixed load current 
 
 
Figure 3.16 Wavelet-packet entropy under fixed load current 
 
3.4.5. Gap Distance Effect 
Similarly, as shown in Figure 3.4, as the gap distance increases, the V-I
characteristic changes until the two curves finally meet at the interrupted point. When
the gap distance exceeds the critical gap distance, the arc becomes unstable and is then
extinguished. Therefore, the high-frequency variation is expected to be more
significant at a longer gap distance, as the stable operating point is closer to the
interrupted point. Figure 3.17 reveals this gap distance dependency, and the
results are as expected. Although the gap distance is different, the entropy
level remains roughly the same, as shown in Table 3.5. Series arc faults
generated at 6.5A/158V are used as examples.
 
 
Figure 3.17 Average frequency spectrum of the first 2 seconds data after the gap 
distance reached the desired value (FFT analysis window is 0.2 seconds) 
Table 3.5 Wavelet-packet entropy level for different gap distance (6.5A/158V)
Band (frequency range)   1 mm gap  2 mm gap  3 mm gap
Band8 (87.5-100 kHz)     8.851     8.547     8.717
Band7 (75-87.5 kHz)      8.637     8.529     8.685
Band6 (62.5-75 kHz)      8.404     8.347     8.660
Band5 (50-62.5 kHz)      7.634     7.418     7.664
Band4 (37.5-50 kHz)      8.603     8.658     8.622
Band3 (25-37.5 kHz)      8.674     8.506     8.687
Band2 (12.5-25 kHz)      8.630     8.687     8.627
Band1 (0-12.5 kHz)       8.489     8.505     8.412
 
3.5. Discussion and Conclusion 
Based on experimental analysis, the arc current and its spectrum depend on the
source voltage, load current, and gap distance. The arc fault tends to produce less
arcing noise when the stable operating point is further away from the interrupted point:
the high-frequency variation induced by the arcing increases with decreasing source
voltage, decreasing load current, and increasing gap distance between the two
electrodes. Therefore, when testing the effectiveness of AFD and AFCI in DC networks,
the arc fault should be generated at a high voltage and current level with a small gap
distance to obtain the lowest level of arc noise, i.e. the worst-case scenario. In addition,
the minimum threshold value for traditional detection methods can be determined under
the worst-case scenario. The UL-1699B Outline does not properly take these important
parameters into consideration for series arc fault testing. In the latest UL-1699B
Standard, however, these parameters have been properly considered. For example, the
gap distance range is reduced from 1.6-6.4mm to 0.8-2.5mm.
For low-current arc faults, the time to reach the final arc phase is much longer
than the required detection time listed in UL-1699B, which is up to 2 seconds in
[14] and 2.5 seconds in [15]. Therefore, during the development stage of arc fault
detection, developers could focus on the characteristics in the initial stage of arc faults.
The results of this comprehensive experimental study also provide meaningful and
useful information for testing the effectiveness and robustness of AFD and AFCI.
Furthermore, by applying wavelet packet entropy analysis to all cases in the
experiment in this chapter, it is found that this method can extract a consistent entropy
pattern of series arc fault that is less sensitive to changes in source voltage, load current,
gap distance, and arc phases. In addition, there are clear changes of entropy before and
after arc fault occurrence. Such entropy features of arc fault current could be adopted
for more effective detection.
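The separation between the non-arc and arc entropy levels can be checked directly from the values reported in Table 3.3 and Table 3.4 for bands 2-4 and 6-8 (cases C1-C8). The short sketch below does only that; the threshold of 8.0 is a hypothetical value chosen for illustration and is not a detection rule proposed in this chapter.

```python
# Entropy levels copied from Table 3.3 (non-arc) and Table 3.4 (arc), cases C1..C8
non_arc = {
    8: [6.140, 5.813, 5.774, 5.838, 6.208, 6.171, 6.767, 6.347],
    7: [7.108, 6.386, 6.499, 6.325, 6.883, 6.908, 7.320, 6.742],
    6: [6.189, 5.780, 5.807, 5.657, 6.522, 6.476, 7.030, 6.576],
    4: [6.095, 6.111, 6.223, 5.988, 6.562, 6.373, 7.061, 6.949],
    3: [6.661, 6.108, 6.313, 6.174, 6.456, 6.305, 7.048, 6.638],
    2: [6.739, 6.607, 6.997, 6.612, 7.071, 6.951, 7.683, 6.776],
}
arc = {
    8: [8.692, 8.695, 8.745, 8.845, 8.718, 8.539, 8.539, 8.541],
    7: [8.678, 8.745, 8.789, 8.835, 8.709, 8.543, 8.626, 8.637],
    6: [8.712, 8.606, 8.493, 8.791, 8.703, 8.427, 8.418, 8.305],
    4: [8.643, 8.650, 8.773, 8.743, 8.692, 8.604, 8.671, 8.603],
    3: [8.677, 8.707, 8.708, 8.822, 8.707, 8.549, 8.689, 8.674],
    2: [8.861, 8.789, 8.787, 8.685, 8.710, 8.650, 8.730, 8.640],
}

max_non_arc = max(v for values in non_arc.values() for v in values)
min_arc = min(v for values in arc.values() for v in values)

THRESHOLD = 8.0   # hypothetical illustrative threshold, not taken from the thesis
separable = max_non_arc < THRESHOLD < min_arc
```

For these six bands, every non-arc value lies below every arc value across all eight cases. Bands 1 and 5 are left out because their non-arc and arc levels overlap in the tables.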
 
 
 
 
Some of the work described in this chapter has been published in: 
1. Shibo Lu, B. T. Phung, Daming Zhang, and Hua Chai, “An Experimental Study of 
Low-Current DC Series Arc Faults for Condition Monitoring Purpose,” International 
Conference and Exhibition on Electricity and Distribution (CIRED), Madrid, Spain, 3-6 
June 2019. 
2. Shibo Lu, B. T. Phung, and Daming Zhang, “Study on DC Series Arc Fault in 
Photovoltaic systems for Condition Monitoring Purpose,” Australasian Universities 
Power Engineering Conference (AUPEC), Melbourne, Australia, Nov. 2017. 
 
 
4. DC Series Arc Fault Detection in PV systems using Deep 
Learning 
4.1. Introduction 
Nowadays, most traditional industries are being transformed by digital Internet
technology. This paradigm shift requires data-driven intelligent decision-making to
enable automation with reduced operational risks. As an emerging field in industrial
applications and an effective solution for intelligent decision-making, ML is
increasingly being used and has demonstrated promising results in DC series arc fault
detection, as reviewed in Section 2.4.7. The general procedures for ML-based DC arc
fault detection methods in PV systems are illustrated in Figure 4.1. To date, the majority
of studies focus on conventional ML methods, while DL techniques are not
well investigated.
This chapter systematically investigates the feasibility of applying DL in series arc 
fault detection in PV systems. A lightweight CNN structure is designed to improve 
detection accuracy and reduce the computation burden, which makes it more suitable 
for Internet-of-Things and edge computing applications. An experimental setup is 
established to collect series arc fault data under different operating conditions. A 
comparative study among different popular ML methods is then performed using the 
same dataset to demonstrate the effectiveness and superior performance of the proposed 
method. 
Finally, barriers obstructing intelligent fault diagnostics from being applied in 
practice are identified. Potential solutions in applying ML, especially DL, for practical 
PV series arc fault detection are presented. 
 
[Figure 4.1 flowchart, recoverable labels:
Data acquisition (measurement system/edge device): current (I), voltage (V), V-I curve, radio-frequency, ...
Conventional machine learning methods: feature extraction by experts (time domain, frequency domain, and time-frequency domain features); dimension reduction and feature selection (principal component analysis, sparse representation, random forest, ...); classification (shallow artificial neural network, support vector machine, fuzzy inference system, ...)
Deep learning methods: feature extraction by experts (optional); automatic feature learning by deep learning models; classification (deep artificial neural network, convolutional neural network, deep belief network, stacked autoencoder, recurrent neural network, ...)
Inset plots: CT current (A) versus time (s) with arc inception marked; current (A) versus voltage (V) with the normal MPP, arc inception, and new MPP marked]
Figure 4.1 General procedures for ML-based DC arc fault detection methods 
 
4.2. Classical Machine Learning 
4.2.1. Artificial Neural Network 
ANN is inspired by biological neural networks and has been used for fault
diagnostics for decades. For example, the multilayer perceptron (MLP) is a class of
feedforward neural networks consisting of at least three fully connected layers (one
input and one output layer with one or more hidden layers) of non-linearly activating
nodes. Given a dataset, $\{\boldsymbol{x}_i, \boldsymbol{y}_i\}_{i=1}^{n}$, of $n$ samples with the
corresponding label vectors, and a $k$-layer MLP (the number of hidden layers is $k - 2$),
the mathematical representation of the output of the $j$th layer, $f^j(\boldsymbol{x}_i^j)$, is as follows:
$$f^j(\boldsymbol{x}_i^j) = \sigma^j({\boldsymbol{w}^j}^T \boldsymbol{x}_i^j + b^j) \quad (4.1)$$
where $\sigma^j$ is the activation function, $\boldsymbol{w}^j \in \boldsymbol{w}$ is the weight matrix, $b^j \in \boldsymbol{b}$ is the bias
coefficient, $\boldsymbol{x}_i^j$ is the input of the $j$th layer, and $j = 2, \ldots, k$. Note that the number of
neurons in the hidden layers is flexible, while the sizes of the input layer and output layer
are identical to the dimensions of the input data and label vector, respectively. The
backpropagation algorithm is widely used for training feedforward neural networks for
supervised learning. Therefore, any ANN trained using the backpropagation algorithm is
also known as a BPNN. In general, BPNN is the most commonly selected type of
ANN for different applications, and the general structure (MLP as an example) is shown
in Figure 4.2.
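The layer-by-layer computation in (4.1) can be sketched in a few lines. This is a minimal illustration, not the network used in this thesis: the layer sizes, weights, and biases are arbitrary hypothetical values, each row of a weight list holds the weights of one neuron (a row of the transposed weight matrix), and a sigmoid is assumed as the activation in every layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(weights, biases, inputs, activation):
    """One layer of (4.1): activation(w^T x + b), evaluated neuron by neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(layers, x):
    """Feed the input through all layers; the output of one layer feeds the next."""
    for weights, biases, activation in layers:
        x = layer_forward(weights, biases, x, activation)
    return x

# Hypothetical 3-layer MLP: 2 inputs -> 2 hidden neurons -> 1 output
layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0], sigmoid),   # hidden layer
    ([[1.0, -1.0]], [0.0], sigmoid),                     # output layer
]
output = mlp_forward(layers, [1.0, 1.0])
```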
 
[Figure 4.2 labels: input layer (Feature 1, Feature 2, ..., Feature d as x1, x2, ..., xd); hidden layers; output layer (Type 1, ..., Type l as y1, ..., yl); d is the dimension of the input array and l is the number of categories for output]
 
Figure 4.2 Structure of a back propagation neural network (MLP) 
 
Given a portion of the dataset $\{\boldsymbol{x}_i, \boldsymbol{y}_i\}_{i=1}^{m} \subset \{\boldsymbol{x}_i, \boldsymbol{y}_i\}_{i=1}^{n}$ for training, the main
objective of BPNN is to minimise the loss, $L$, between the predicted output and the label
vector. Different kinds of loss functions can be used for training. For
example, when the mean-square error, $L_{MSE}$, is used, the objective function to be
minimised by the BPNN is:
$$\min_{\boldsymbol{w}, \boldsymbol{b}} L_{MSE}(\boldsymbol{w}, \boldsymbol{b}) = \frac{1}{m} \sum_{i=1}^{m} [f^k(\boldsymbol{x}_i^k) - \boldsymbol{y}_i]^2 \quad (4.2)$$
After feeding the training dataset, the calculated error is propagated backward
all the way to the input layer to update the parameters of the BPNN using gradient
descent with a learning rate of $\delta$, as follows:
$$\boldsymbol{w} \leftarrow \boldsymbol{w} - \delta \frac{\partial L(\boldsymbol{w}, \boldsymbol{b})}{\partial \boldsymbol{w}}, \quad \boldsymbol{b} \leftarrow \boldsymbol{b} - \delta \frac{\partial L(\boldsymbol{w}, \boldsymbol{b})}{\partial \boldsymbol{b}} \quad (4.3)$$
The principles of the backpropagation algorithm and gradient descent are shown in
Figure 4.3 and Figure 4.4, respectively.
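The update rule (4.3) applied to the MSE loss (4.2) can be sketched for the simplest possible case. The example below assumes a single linear neuron with one weight and one bias (so no chain rule through hidden layers is needed), a toy dataset generated from y = 2x + 1, and a hypothetical learning rate of 0.05; it is not the training setup used in this thesis.

```python
def mse(w, b, xs, ys):
    """Mean-square error (4.2) for a single linear neuron f(x) = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_step(w, b, xs, ys, lr):
    """One gradient-descent update (4.3) using the analytic MSE gradients."""
    m = len(xs)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / m
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / m
    return w - lr * grad_w, b - lr * grad_b

# Toy data generated from y = 2x + 1 (assumed, for illustration only)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
initial_loss = mse(w, b, xs, ys)
for _ in range(2000):
    w, b = gradient_step(w, b, xs, ys, lr=0.05)
final_loss = mse(w, b, xs, ys)
```

After training, `w` and `b` approach the generating values 2 and 1, and the loss falls accordingly.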
 
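[Figure 4.3 labels: feed-forward pass from $f^{j-1}$ through layer $j$ (parameters $w^j$, $b^j$) to $f^j$, followed by calculation of $L$; backpropagation pass using the chain rule:
$\frac{\partial L}{\partial f^{j-1}} = \frac{\partial L}{\partial f^{j}} \frac{\partial f^{j}}{\partial f^{j-1}}$, $\frac{\partial L}{\partial w^{j}} = \frac{\partial L}{\partial f^{j}} \frac{\partial f^{j}}{\partial w^{j}}$, $\frac{\partial L}{\partial b^{j}} = \frac{\partial L}{\partial f^{j}} \frac{\partial f^{j}}{\partial b^{j}}$]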
 
Figure 4.3 Illustration of the backpropagation algorithm 
 
[Figure 4.4 panels: optimal learning rate, large learning rate, and small learning rate; each panel plots the loss L(w) against the weight w, starting from an initial weight, with the gradient and the minimum Lmin(w) marked]
 
Figure 4.4 Gradient descent with different learning rate 
Gradient descent with a small learning rate requires a significant number of
iterations before converging to the minimum. On the other hand, if the learning rate is
too large, gradient descent can overshoot the minimum, which can cause
non-convergence or even divergence. Therefore, the learning rate needs to be carefully
selected through trial and error in order to achieve a desirable performance.
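The effect of the learning rate can be reproduced on a one-dimensional toy loss. The sketch below assumes L(w) = w^2 with its minimum at w = 0, and the three learning rates are hypothetical values chosen only to show slow convergence, fast convergence, and divergence.

```python
def descend(lr, w0=1.0, steps=20):
    """Run gradient descent on L(w) = w**2, whose gradient is 2*w."""
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * w
    return w

small = descend(0.01)   # creeps towards the minimum, still far away after 20 steps
good = descend(0.4)     # converges quickly
large = descend(1.1)    # overshoots on every step and diverges
```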
ANNs offer several advantages, such as easily solving multi-class classification
problems and handling large amounts of data and input variables. However, ANNs
have low interpretability because of their black-box nature. In addition, relatively
large ANN models cannot always find the global optimum, which makes them
prone to overfitting.
4.2.2. Support Vector Machine 
SVM is a supervised learning algorithm, which is widely employed in many fault
diagnostics.
Take the non-kernel SVM (or linear SVM) as an example: consider a binary
classification problem, given a dataset, $\{\boldsymbol{x}_i, y_i\}_{i=1}^{n}$, of $n$ samples and the corresponding
labels, where $\boldsymbol{x}_i \in \boldsymbol{x}$ and $y_i \in \{-1, 1\}$, a hyperplane $f(\boldsymbol{x}) = 0$ is chosen to separate the
data into a positive and a negative group, as follows:
$$f(\boldsymbol{x}_i) = \boldsymbol{w}^T \boldsymbol{x}_i + b = 0 \quad (4.4)$$
where $\boldsymbol{w}$ and $b$ are the parameters that determine the hyperplane. Then, a positive
boundary and a negative boundary can be determined by the closest point to the
hyperplane in each group. Any point above the positive boundary belongs to the class with
label 1, while any point below the negative boundary belongs to the class with label -1.
After rescaling the distance of the closest point from the hyperplane in each group to be 1,
the chosen hyperplane is subject to the following condition to separate the dataset:
$$y_i f(\boldsymbol{x}_i) = y_i(\boldsymbol{w}^T \boldsymbol{x}_i + b) \geq 1, \quad i = 1, 2, \ldots, n \quad (4.5)$$
As shown in Figure 4.5, in order to find the optimal hyperplane that achieves perfect
separation, the margin $\gamma = 2/\|\boldsymbol{w}\|$ (the distance between the positive boundary and the
negative boundary) should be maximised, where $\|\cdot\|$ is the norm operator. As a
result, the following optimisation problem is formulated for the non-kernel SVM [135]:
$$\min_{\boldsymbol{w}, b} \; Cost(\boldsymbol{w}, b) = \frac{\|\boldsymbol{w}\|^2}{2}$$
$$s.t. \;\; y_i(\boldsymbol{w}^T \boldsymbol{x}_i + b) \geq 1, \quad i = 1, 2, \ldots, n \quad (4.6)$$
Note that the square term in (4.6) is used to simplify the computation. For a
dataset that is not linearly separable, a hyperplane with a soft margin can be found by
adding regularisation terms in the cost function in (4.6).
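The constraint (4.5) and the margin $2/\|\boldsymbol{w}\|$ can be checked by hand on a small example. The sketch below does not solve the optimisation problem (4.6); it only evaluates hypothetical candidate hyperplanes on a made-up two-dimensional dataset to show that, among feasible candidates, the one with the smaller norm has the larger margin.

```python
import math

def decision(w, b, x):
    """f(x) = w^T x + b from (4.4)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def satisfies_constraints(w, b, samples):
    """Check y_i * f(x_i) >= 1 for every sample, as required by (4.5)."""
    return all(y * decision(w, b, x) >= 1.0 for x, y in samples)

def margin(w):
    """Distance between the positive and negative boundaries, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Made-up linearly separable data: (features, label)
samples = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((1.0, 0.0), -1)]

# Hypothetical candidate hyperplanes (w, b)
feasible_small_norm = ((1.0, 1.0), -3.0)
feasible_large_norm = ((2.0, 2.0), -6.0)
infeasible = ((0.5, 0.5), -1.5)
```

Both feasible candidates describe the same separating line, but the rescaled one with the smaller norm yields the larger margin, which is why (4.6) minimises $\|\boldsymbol{w}\|^2/2$ subject to the constraints.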
SVMs have a solid mathematical foundation in statistical learning theory. Training
a typical SVM is a convex optimisation problem, for which the global optimum can
always be found. Therefore, SVMs are less prone to overfitting problems compared to
ANNs. However, the main disadvantage is that SVMs are sensitive to the
choice of kernel. Additionally, they are computationally inefficient with a large dataset
and do not work well when the number of input features is greater than the number of
samples. Consequently, SVMs require more feature engineering effort.
[Figure 4.5 labels: Feature 1 versus Feature 2; Class A; Class B; optimal hyperplane; best margin]
 
Figure 4.5 A simple linear SVM for classification 
4.2.3. Decision Tree and Random Forest 
DT is a supervised learning algorithm widely used in classification problems, 
which breaks the input space into regions with separate parameters for each region. It is 
a decision-making process by establishing the relationship between the attributes and 
the classes using a flowchart-like structure. A simple example of series arc fault 
detection using DT is visualised in Figure 4.6. 
DTs are usually unstable: a tiny change in the data can cause a significant
change in the optimal structure of the DT. On the other hand, DTs are simple to interpret
and can easily achieve good performance with simple input data, which makes them
suitable for some specific applications.
RF is one particular tree-based model that mitigates overfitting in DT by
integrating multiple DT-based classifiers. It has recently been found to generalise well
in series arc fault detection across different grid-connected PV systems [91].
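The flowchart-like decisions of a DT and the voting of an RF can be sketched with plain conditionals. Everything in the example below is hypothetical: the feature names, the thresholds, and the three hand-written trees are invented for illustration, whereas a real RF learns its trees from bootstrap samples of the training data.

```python
from collections import Counter

def tree_1(s):
    # Two chained checks, in the style of the flowchart in Figure 4.6
    if s["band_entropy"] > 8.0:
        return "arc" if s["current_std"] > 0.05 else "non-arc"
    return "non-arc"

def tree_2(s):
    return "arc" if s["hf_energy"] > 1.0 else "non-arc"

def tree_3(s):
    if s["current_std"] > 0.05:
        return "arc" if s["hf_energy"] > 0.5 else "non-arc"
    return "non-arc"

def forest_predict(trees, sample):
    """Majority vote over the individual tree decisions."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical feature vector of one analysed window
sample = {"band_entropy": 8.7, "current_std": 0.08, "hf_energy": 0.7}
prediction = forest_predict([tree_1, tree_2, tree_3], sample)
```

Here one tree disagrees with the other two, and the vote still returns the majority decision, which is the mechanism by which the ensemble reduces the sensitivity of a single tree.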
[Figure 4.6 labels: a decision tree with three decision nodes ("Feature 1 satisfies condition 1?", "Feature 2 satisfies condition 2?", "Feature 3 satisfies condition 3?"), Yes/No branches, and leaves labelled "Non-Arc" or "Series Arc Fault"]
 
Figure 4.6 A simple illustration of decision tree for series arc fault detection 
4.2.4. k-Nearest Neighbours 
Consider a training dataset, $\{\boldsymbol{x}_i, y_i\}_{i=1}^{n}$, of $n$ samples with corresponding labels, and
an unlabelled testing dataset $\{\boldsymbol{x}_j\}_{j=1}^{m}$. For each test sample $\boldsymbol{x}_j$, the kNN classification
algorithm first computes the distance $d(\boldsymbol{x}_i, \boldsymbol{x}_j)$ to every training example $\boldsymbol{x}_i$. One of the
most common distances, the Euclidean distance, is given in (4.7):
$$d(\boldsymbol{x}_i, \boldsymbol{x}_j) = \|\boldsymbol{x}_i - \boldsymbol{x}_j\|_2 \quad (4.7)$$
Then, it selects the $k$ closest instances $\{\boldsymbol{x}_{i1}, \ldots, \boldsymbol{x}_{ik}\}$ and their labels $\{y_{i1}, \ldots, y_{ik}\}$.
Finally, it outputs the most frequent label in $\{y_{i1}, \ldots, y_{ik}\}$ as $y_{predict}$ for the input
sample. For the kNN regression algorithm, the last step is modified: it computes the
mean of $\{y_{i1}, \ldots, y_{ik}\}$ as follows:
$$y_{predict} = \frac{1}{k} \sum_{n=1}^{k} y_{in} \quad (4.8)$$
There are two main advantages of kNN: firstly, it is a very simple ML model and
easy to implement; secondly, it has fewer hyperparameters to tune. However, the
computation cost increases significantly when the sample size is large, and the value of
k is difficult to select. Additionally, Goodfellow et al. report that the output of kNN on
small training sets will essentially be random [136]. All those negative factors make the
kNN method less popular.
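Both variants of kNN follow directly from (4.7) and (4.8). The sketch below is a brute-force illustration on made-up data; the function names and the toy datasets are assumptions, and no attempt is made to speed up the neighbour search.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance (4.7) between two feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def k_nearest_labels(train, query, k):
    """Labels of the k training samples closest to the query."""
    ranked = sorted(train, key=lambda pair: euclidean(pair[0], query))
    return [label for _, label in ranked[:k]]

def knn_classify(train, query, k):
    """Most frequent label among the k nearest neighbours."""
    return Counter(k_nearest_labels(train, query, k)).most_common(1)[0][0]

def knn_regress(train, query, k):
    """Mean label of the k nearest neighbours, as in (4.8)."""
    return sum(k_nearest_labels(train, query, k)) / k

# Made-up training data: (features, label)
train_cls = [((0.0, 0.0), "non-arc"), ((0.1, 0.2), "non-arc"),
             ((1.0, 1.0), "arc"), ((0.9, 1.1), "arc"), ((1.2, 0.8), "arc")]
train_reg = [((0.0,), 0.0), ((1.0,), 1.0), ((2.0,), 4.0), ((3.0,), 9.0)]

label = knn_classify(train_cls, (0.8, 0.9), k=3)
value = knn_regress(train_reg, (1.4,), k=2)
```

Every query requires a distance to every training sample, which is the source of the computation cost noted above.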
4.2.5. Others 
Besides the abovementioned algorithms, there are other conventional ML methods
such as NB, FL, HMM, etc.
4.3. Deep Learning 
Conventional ML classifiers with shallow structures require a powerful feature
extractor that solves the selectivity-invariance dilemma [137], and they cannot be
continually improved by increasing the size of the training data. DL is the most active
development in the area of ML nowadays. The working principle of the DL architecture
is more similar to that of the human biological nervous system. DL enables systems to
discover complex features through learning many simple features, and the simple
features are represented in terms of each other. To be specific, in the hidden layers, the
output from the previous layer is the input of the next layer, and many layers are
cascaded in this way. The deep structure thereby discards useless information and
disturbance (it disentangles the factors of variation), which leads to a better
performance of the system.
DL is getting more attention because it requires less feature engineering
and can achieve higher performance with the help of big data and of revolutions in
algorithms and hardware.
For example, the batch normalisation (BN) layer was developed to mitigate the
problems that arise from poor coefficient initialisation, and it helps gradient flow
in deeper models when the deep neural network is trained using mini-batches [138].
The mathematical representation of BN is as follows:
$$f_{BN}(\boldsymbol{x}) = \gamma \left( \frac{\boldsymbol{x} - \mu_b}{\sqrt{\sigma_b^2 + \varepsilon}} \right) + \beta \quad (4.9)$$
where $\boldsymbol{x}$, $\mu_b$, $\sigma_b^2$, $\gamma$, $\beta$, and $\varepsilon$ denote the input, mini-batch mean, mini-batch variance,
scale factor, offset, and stability parameter, respectively. Initially, the BN layer transforms
the input to a mapping with zero mean and unit variance. After that, it shifts and scales
that mapping with the learnable parameters, $\gamma$ and $\beta$, to make it optimal for the
successive layers in the deep neural network. In this way, the effects of internal covariate
shift can be mitigated, and the training of the neural network can be stabilised and accelerated.
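The two steps of (4.9), normalising with the mini-batch statistics and then scaling and shifting, can be sketched for a single scalar feature. The function name, the mini-batch values, and the default $\varepsilon$ of 1e-5 are assumptions for illustration; in a trained network $\gamma$ and $\beta$ would be learned rather than set by hand.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply (4.9) to a mini-batch of scalar activations."""
    n = len(batch)
    mean = sum(batch) / n                              # mini-batch mean
    var = sum((x - mean) ** 2 for x in batch) / n      # mini-batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

def mean_and_variance(values):
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)

batch = [1.0, 2.0, 3.0, 4.0]
normalised = batch_norm(batch)                         # zero mean, close to unit variance
shifted = batch_norm(batch, gamma=2.0, beta=3.0)       # mean moved to beta, spread scaled by gamma
```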
Another good example of a major algorithmic change is the use of the ReLU activation
function. Conventionally, the sigmoid and tanh activation functions are widely used in
shallow ANNs. They tend to cause a severe vanishing gradient problem because of
saturation. Once saturation occurs, it becomes challenging for the learning algorithm to
continue adjusting the weights to improve the performance of the model. The ReLU
activation function overcomes the vanishing gradient problem because it is nearly linear
and does not cause saturation. The models are also easier to optimise since their
behaviour is closer to linear when using ReLU [136]. The mathematical representations
of the different activation functions used in this research are summarised in Table 4.1.
 
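Table 4.1 Activation functions used in this thesis
Name | Representation | Plot
Sigmoid | $f(x) = \frac{1}{1 + e^{-x}}$ | (plot not reproduced)
Tanh | $f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$ | (plot not reproduced)
ReLU | $f(x) = \begin{cases} 0, & x \le 0 \\ x, & x > 0 \end{cases}$ | (plot not reproduced)
Leaky ReLU | $f(x) = \begin{cases} \alpha x, & x \le 0 \\ x, & x > 0 \end{cases}$ | (plot not reproduced; $\alpha = 0.2$ as an example)
Softmax | $f(\boldsymbol{x})_i = \frac{e^{x_i}}{\sum_{j=1}^{J} e^{x_j}}$ for $i = 1, \dots, J$ | N/A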
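The activation functions of Table 4.1 can be sketched directly in NumPy, as below. The function names are illustrative, and the max-subtraction inside `softmax` is a common numerical-stability step added here as an assumption; it does not change the result of the formula in the table.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def relu(x):
    return np.where(x > 0, x, 0.0)

def leaky_relu(x, alpha=0.2):
    # alpha = 0.2 follows the example value given in Table 4.1.
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    e = np.exp(x - np.max(x))   # shift for numerical stability
    return e / e.sum()

values = np.array([-2.0, 0.0, 3.0])
print(sigmoid(values), relu(values), leaky_relu(values), softmax(values))
```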
 
 
 
4.3.1. Deep Fully-Connected Neural Network 
The simplest DL structure is the deep MLP, which consists of multiple fully-
connected layers for hierarchical feature extraction and a decision layer for 
classification (generally the softmax layer). The mathematical representations are 
similar to the shallow MLP introduced in Section 4.2.1, while the number of hidden layers 
is generally greater than three. 
4.3.2. Autoencoder 
A simple auto-encoder consists of two parts: an encoder and a decoder. Given the
input dataset $\{\boldsymbol{x}_i, y_i\}_{i=1}^{n}$ with n samples, the encoding process and decoding process can
be represented as follows:
$$f_e(\boldsymbol{x}_i) = \boldsymbol{h}_i = \sigma_e(\boldsymbol{w}_e^T \boldsymbol{x}_i + \boldsymbol{b}_e) \tag{4.10}$$
$$f_d(\boldsymbol{h}_i) = \boldsymbol{x}_i' = \sigma_d(\boldsymbol{w}_d^T \boldsymbol{h}_i + \boldsymbol{b}_d) \tag{4.11}$$
where the subscripts e and d represent the encoder and decoder; $\boldsymbol{h}_i$ denotes the
features extracted by the encoder; $\boldsymbol{x}_i'$ is the sample reconstructed by the decoder; $\theta = \{\boldsymbol{w}, \boldsymbol{b}\}$
and $\sigma$ are the network parameters and activation function, respectively. The
optimisation objective of the auto-encoder is to minimise the reconstruction error of the
input samples. Therefore, it can be optimised using the following cost function:
$$\min_{\theta_e, \theta_d} Cost(\theta_e, \theta_d) = \frac{1}{n} \sum_{i=1}^{n} \| \boldsymbol{x}_i - \boldsymbol{x}_i' \|^2 \tag{4.12}$$
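The encoder, decoder, and cost of Eqs. (4.10) to (4.12) can be sketched as follows. This is only an illustration under stated assumptions: the layer sizes, the random weights, and the choice of sigmoid for both activation functions are hypothetical, and no training loop is included.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, w_e, b_e):
    return sigmoid(w_e.T @ x + b_e)      # Eq. (4.10)

def decode(h, w_d, b_d):
    return sigmoid(w_d.T @ h + b_d)      # Eq. (4.11)

def reconstruction_cost(X, w_e, b_e, w_d, b_d):
    # Eq. (4.12): mean squared reconstruction error over the n samples.
    errors = [np.sum((x - decode(encode(x, w_e, b_e), w_d, b_d)) ** 2) for x in X]
    return float(np.mean(errors))

# Hypothetical sizes: 10 samples, 8 inputs, 3 hidden features.
rng = np.random.default_rng(0)
n, d_in, d_hid = 10, 8, 3
X = rng.uniform(0.0, 1.0, size=(n, d_in))
w_e, b_e = rng.normal(scale=0.1, size=(d_in, d_hid)), np.zeros(d_hid)
w_d, b_d = rng.normal(scale=0.1, size=(d_hid, d_in)), np.zeros(d_in)
cost = reconstruction_cost(X, w_e, b_e, w_d, b_d)
print(cost)
```

In practice the parameters would be updated by gradient descent to reduce `cost`; for a stacked autoencoder, the output of `encode` would serve as the input of the next auto-encoder.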
There are typically two ways to construct an auto-encoder with deep structure. The 
first way is achieved by stacking multiple auto-encoders to form a stacked autoencoder 
(SAE). The output from the encoder part of the first auto-encoder is used as the input to 
the second auto-encoder. Then, greedy layer-wise pre-training is performed to establish 
 
the SAE. The typical process to construct an SAE is visualised in Figure 4.7. 
[Figure 4.7 stages: train the first AE; train the second AE; train the last AE; stack all the encoder parts of the AEs; add classification layer and fine-tuning]
Figure 4.7 Diagram of an SAE for series arc fault detection 
After establishing the SAE with pre-trained parameters, a decision layer (i.e. a 
softmax layer) is connected, and labels can be used to fine-tune the whole structure in a 
supervised way to achieve classification. The other method to construct an auto-encoder 
with deep structure is replacing a single layer with multiple layers in both encoder and 
decoder parts. 
 
4.3.3. Convolutional Neural Network 
As shown in Figure 4.8, a CNN model typically consists of a convolution
operator-based feature extractor, fully-connected layers for higher level reasoning, and
a classification layer. The computational complexity of convolution layers is lower
than that of fully-connected layers in terms of the required matrix multiplication
operations because of the configuration method, where each neuron in a convolution
layer is only connected to a small set of neurons in the following layer. Also, such a
configuration makes CNNs excellent at extracting regional characteristics of the input
sample. For an input matrix $\boldsymbol{x}_i^{j-1}$ with P channels from the previous layer, K filters
with size of $H_f \times L_f$, and step size $s = 1$, the convolution operation of the $k$th filter at the
$j$th layer can be represented as follows:
$$(\boldsymbol{x}_i^{j})_{h_o, l_o, k} = \sigma(\boldsymbol{x}_i^{j-1} * \boldsymbol{w}_k^{j} + b_k^{j}) = \sigma\left( \sum_{p=0}^{P-1} \sum_{h_f=0}^{H_f-1} \sum_{l_f=0}^{L_f-1} (\boldsymbol{x}_i^{j-1})_{s \times h_o + h_f,\, s \times l_o + l_f,\, p} \times (\boldsymbol{w}_k^{j})_{h_f, l_f, p} + b_k^{j} \right) \tag{4.13}$$
where $\boldsymbol{x}_i^{j}$ denotes the output feature map at the $j$th layer with the size of $H_o \times L_o \times K$;
$h_o = \{1, \dots, H_o\}$, $l_o = \{1, \dots, L_o\}$, $k = \{1, \dots, K\}$ represent the row, column, and depth
index of the output feature map, respectively; $\boldsymbol{w}_k^{j}$ and $b_k^{j}$ are the weight matrix and bias
coefficient of the $k$th filter in the $j$th layer, respectively; $\sigma$ denotes the activation function,
which is typically ReLU for DNN since it can mitigate the gradient vanishing problem 
[139]. After each CNN layer, a pooling layer is usually used to achieve dimension 
reduction. Maxpooling layer is the most common pooling layer, which can be 
represented as follows: 
$$(\boldsymbol{x}_i^{j})_{h_o, l_o, k} = \max\left( (\boldsymbol{x}_i^{j-1})_{h_o : h_o + H_{MaxP} - 1,\; l_o : l_o + L_{MaxP} - 1,\; k} \right) \tag{4.14}$$
 
where the size of the max operator is 𝐻MaxP × 𝐿MaxP , the operation step size is 1, the 
size of the output feature is 𝐻𝑜 × 𝐿𝑜 × 𝐾. Then, the hierarchical features can be found 
by stacking several CNN layers and pooling layers. Next, the output of the last pooling layer is 
flattened to a 1D vector and connected to fully-connected layers for further reasoning. 
Finally, a classification layer (i.e. softmax layer) is connected to map the input sample 
into the target class. 
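A loop-based sketch of the convolution in Eq. (4.13) and the max pooling in Eq. (4.14) is given below. The function names and the all-ones test data are illustrative assumptions; the pooling stride is exposed as a parameter `s` so that both the step size of 1 used in Eq. (4.14) and other strides can be tried, and ReLU is assumed as the activation.

```python
import numpy as np

def conv_layer(x, w, b, s=1):
    """x: (H, L, P); w: (K, Hf, Lf, P); b: (K,). Returns (Ho, Lo, K) after ReLU."""
    H, L, P = x.shape
    K, Hf, Lf, _ = w.shape
    Ho, Lo = (H - Hf) // s + 1, (L - Lf) // s + 1
    out = np.zeros((Ho, Lo, K))
    for k in range(K):
        for ho in range(Ho):
            for lo in range(Lo):
                patch = x[s * ho:s * ho + Hf, s * lo:s * lo + Lf, :]
                out[ho, lo, k] = np.sum(patch * w[k]) + b[k]   # triple sum of Eq. (4.13)
    return np.maximum(out, 0.0)

def max_pool(x, Hm, Lm, s=1):
    """Max over an Hm-by-Lm window per channel, following Eq. (4.14)."""
    H, L, K = x.shape
    Ho, Lo = (H - Hm) // s + 1, (L - Lm) // s + 1
    out = np.zeros((Ho, Lo, K))
    for ho in range(Ho):
        for lo in range(Lo):
            out[ho, lo, :] = x[s * ho:s * ho + Hm, s * lo:s * lo + Lm, :].max(axis=(0, 1))
    return out

sample = np.ones((20, 20, 1))        # hypothetical 20 x 20 single-channel input
filters = np.ones((3, 5, 5, 1))      # three 5 x 5 filters
features = conv_layer(sample, filters, np.zeros(3))
pooled = max_pool(features, 2, 2, s=2)
print(features.shape, pooled.shape)  # (16, 16, 3) (8, 8, 3)
```

Explicit loops are used only to mirror the index notation of the equations; a practical implementation would rely on a vectorised library routine.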
[Figure 4.8 labels: convolutional kernel; convolutional layer; pooling kernel; pooling layer; fully connected layers; decision layer; stages: feature extraction, reasoning and decision]
Figure 4.8 General structure for a convolutional neural network 
 
4.3.4. Recurrent Neural Network 
In a recurrent neural network (RNN), the links between the nodes form a directed
graph along a temporal sequence, which makes the RNN capable of exploring the dynamic
behaviour of time-series data. The long short-term memory (LSTM) model, consisting of
many LSTM blocks, is one of the best-performing and most popular RNNs. The most
beneficial part of LSTM is that it introduces an internal recurrence besides the outer
recurrence in traditional RNNs. With such a modification, the LSTM model is easier to
train since the gradient can flow for long durations [136]. An LSTM block contains 
 
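several units to control the flow of information, including a state unit $\boldsymbol{s}_i^t$, a forget gate
unit $\boldsymbol{f}_i^t$, an external input gate unit $\boldsymbol{g}_i^t$, and an output gate unit $\boldsymbol{q}_i^t$ for a time step t, layer
index i, and the current input vector $\boldsymbol{x}_i^t$. Then, the mathematical representation of an
LSTM block can be formulated as follows:
$$\boldsymbol{f}_i^t = \sigma_{sm}(\boldsymbol{u}_i^{fT} \boldsymbol{x}_i^t + \boldsymbol{w}_i^{fT} \boldsymbol{h}_i^{t-1} + \boldsymbol{b}_i^{f}) \tag{4.15}$$
$$\boldsymbol{s}_i^t = \boldsymbol{f}_i^t \boldsymbol{s}_i^{t-1} + \boldsymbol{g}_i^t \sigma_{tanh}(\boldsymbol{u}_i^{sT} \boldsymbol{x}_i^t + \boldsymbol{w}_i^{sT} \boldsymbol{h}_i^{t-1} + \boldsymbol{b}_i^{s}) \tag{4.16}$$
$$\boldsymbol{g}_i^t = \sigma_{sm}(\boldsymbol{u}_i^{gT} \boldsymbol{x}_i^t + \boldsymbol{w}_i^{gT} \boldsymbol{h}_i^{t-1} + \boldsymbol{b}_i^{g}) \tag{4.17}$$
$$\boldsymbol{h}_i^t = \sigma_{tanh}(\boldsymbol{s}_i^t) \boldsymbol{q}_i^t \tag{4.18}$$
$$\boldsymbol{q}_i^t = \sigma_{sm}(\boldsymbol{u}_i^{qT} \boldsymbol{x}_i^t + \boldsymbol{w}_i^{qT} \boldsymbol{h}_i^{t-1} + \boldsymbol{b}_i^{q}) \tag{4.19}$$
where $\boldsymbol{h}_i$, $\boldsymbol{w}_i$, $\boldsymbol{u}_i$, $\boldsymbol{b}_i$ are the current hidden layer vector (it is also the output for the
current LSTM unit), recurrent weights, input weights, and biases for the $i$th layer of
the LSTM block, respectively; the superscripts s, f, g, and q indicate the correspondence of
the parameters to different units; $\sigma_{sm}$ and $\sigma_{tanh}$ are the sigmoid and tanh activation
functions, respectively. An illustration of an LSTM unit is shown in Figure 4.9. After the
information over time is obtained by the LSTM model (consisting of several LSTM
modules), fully-connected layers are used to reason over the output of the LSTM model in
many-to-one mode or in many-to-many mode. Finally, a softmax layer is connected at
the end to achieve classification. 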
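One time step of an LSTM block, following Eqs. (4.15) to (4.19), can be sketched as below. The dictionary layout of the parameters, the vector sizes, and the all-zero weights are illustrative assumptions; the products between gate vectors and state vectors are taken element-wise, which is the usual reading of these equations.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, s_prev, u, w, b):
    """u[k]: (d_in, d_hid) input weights, w[k]: (d_hid, d_hid) recurrent weights,
    b[k]: (d_hid,) biases, for k in {'f', 'g', 's', 'q'}."""
    f = sigmoid(u["f"].T @ x_t + w["f"].T @ h_prev + b["f"])        # forget gate, Eq. (4.15)
    g = sigmoid(u["g"].T @ x_t + w["g"].T @ h_prev + b["g"])        # input gate, Eq. (4.17)
    q = sigmoid(u["q"].T @ x_t + w["q"].T @ h_prev + b["q"])        # output gate, Eq. (4.19)
    candidate = np.tanh(u["s"].T @ x_t + w["s"].T @ h_prev + b["s"])
    s = f * s_prev + g * candidate                                  # state, Eq. (4.16)
    h = np.tanh(s) * q                                              # hidden output, Eq. (4.18)
    return h, s

# Hypothetical sizes and all-zero parameters, so every gate equals 0.5.
d_in, d_hid = 4, 3
u = {k: np.zeros((d_in, d_hid)) for k in "fgsq"}
w = {k: np.zeros((d_hid, d_hid)) for k in "fgsq"}
b = {k: np.zeros(d_hid) for k in "fgsq"}
h, s = lstm_step(np.ones(d_in), np.zeros(d_hid), np.ones(d_hid), u, w, b)
print(h, s)
```

A sequence is processed by calling `lstm_step` repeatedly and feeding the returned `h` and `s` back in as `h_prev` and `s_prev`.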
There is an extended LSTM model called bidirectional LSTM (Bi-LSTM). 
Basically, Bi-LSTM combines two independent LSTMs together, which allows the 
whole network to have both backward and forward information about the input 
sequence at every time step [140]. This type of model generally gives better 
performance when the context of the input is needed. 
 
[Figure 4.9 labels: LSTM block for time step t and layer i, with inputs x_{t,i}, h_{t-1,i}, s_{t-1,i} and outputs h_{t,i}, s_{t,i}, y_{t,i}; neighbouring blocks: LSTM at time step t-1, LSTM at time step t+1, LSTM at time step t and layer i+1]
Figure 4.9 Diagram of an LSTM block in an LSTM model 
4.4. Experimental Setup 
Arc fault experiments are performed using the experimental system shown in 
Figure 4.10. Different from the experimental setup established in Chapter 3, the DC 
power source is replaced by a Magna power TSD-1000-20/415 programmable DC 
power supply (PV emulator) and the resistive load is replaced by a Sunny Boy 1.5 
single-phase inverter. Thus, a 1.5-kW emulator-based grid-tied PV system is 
established. The purpose of using a PV emulator is to achieve different operating 
conditions (e.g. different irradiance and PV cell temperature) in a more controllable 
environment [141]. Similarly, sensors comprising a PROSys-CP35 differential current 
probe, an SI-9000 differential voltage probe, and a Pearson-4688 CT are used to sense 
the loop current, arc voltage, and high-frequency information of the loop current, 
respectively. All the electrical signals are streamed to the personal computer using a 
DAQ system which consists of a National Instruments PXIe-1073 and a PXIe-4300 
module with 200-kHz sampling rate and 16-bit analog-to-digital conversion resolution. 
 
Flat-tip cylindrical copper rod electrodes with 6.35 mm (1/4 in.) diameter are used in 
the arc generator, and a Nema-42 servo motor with a controller based on an Arduino 
Uno and A4988 motor driver are employed to accurately control the electrode 
separation speed and distance. 
 
 
Figure 4.10 Schematic diagram of the experimental setup 
 
The series arc faults are generated between the positive terminal of the PV emulator
and the positive terminal of the solar inverter with a separation rate of 5 mm/s and a
separation distance of 0.5 mm. To obtain series arc fault signals under different conditions,
the same experiment is repeated with different combinations of irradiance level from
400 W/m² to 1000 W/m² and temperature from 0 °C to 45 °C. 
 
 
 
4.5. Proposed Deep Learning based Series Arc Fault Detection Method using 
Convolutional Neural Network 
Traditional fault diagnosis techniques need to extract the important statistical 
parameters from the raw data manually to achieve fault identification. This process is 
known as feature extraction. Researchers have proposed several feature extraction 
techniques to identify series arc faults based on domain-specific knowledge as reviewed 
in Chapter 2. In the conventional ML based series arc fault detection approaches, the 
extracted features are fed into ML classifiers for decision making as illustrated in Figure 
4.1. The main drawback of these approaches is that the decision accuracy heavily 
depends on the input features [91]. In addition to that, the domain experts need to invest 
considerable time on feature learning for each fault condition. Even so, there is no 
guarantee that the extracted features can fully represent the unique characteristics of 
series arc faults under different operating conditions. 
DL enables automated hierarchical feature learning from data and has proven
successful in industrial applications from various domains such as image processing [142], 
speech recognition [143], and time series sensor data analytics [144]. Recently, 
researchers have started to apply DL based data analysis techniques to enhance the 
operations in industrial applications. Zeng et al. [145] proposed a two-stream multi-rate 
RNN for pedestrian identification with the aid of two CNNs to extract the spatial and 
motion features from raw video frames. Sun et al. [146] came up with a DL based 
sparse deep stacking network that can eliminate the overfitting of DL models in motor 
fault diagnosis applications. Guo et al. [147] exploited the CNNs combined with 
continuous wavelet transformation to detect earth faults via transforming the fault 
current signal into time-frequency grey-scale images. Jiang et al. [148] proposed 
multiscale CNN based DL architecture to automate the fault feature extraction from raw 
 
vibration signals of wind turbine gearbox. Zhao et al. [149] developed deep residual 
networks with dynamically weighted wavelet coefficients to enhance the fault diagnosis 
of planetary gearboxes. Zhao et al. [150] combined time domain hand-engineered 
features with bidirectional gated recurrent unit networks to monitor machine health 
conditions. Different DL architectures and input features are used in the aforementioned 
studies based on their application context. The choice of architecture and the feature 
set are key requirements to design a viable solution. 
Inspired by these successes of DL, particularly the successes of CNN, CNN based 
intelligent detection is proposed for series arc fault detection in PV systems without any 
hand-crafted features. The proposed method requires little prior knowledge on signal 
characteristics. Some application-specific requirements and constraints are addressed in 
the rest of this thesis. 
4.5.1. Dataset Preparation 
Different types of disturbance exist in different frequency ranges in PV systems as 
summarised in Section 2.2.2. Above 100 kHz, the signals are mainly affected by radio-
frequency noise [57], [131]. In the lower frequency range, switching noise is one of the 
main concerns that can potentially cause nuisance tripping of AFD/AFCI. For example, 
solar inverters can generate switching noise and its harmonics from 1 kHz to above 100 
kHz [57], [58], [108], [151], [152]. Based on extensive experimental studies carried out 
by Sandia National Labs, and considering the pink noise nature of arc faults (the power 
spectral density is inversely proportional to the frequency of the signal) and severe 
radio-frequency interference in the higher frequency range, the frequency band of 0.1-
100 kHz is recommended for detection purpose even though the signal is inevitably 
affected by the switching noise [57]. 
 
Although a higher sampling rate offers more information about the signals, it 
significantly increases the computation load of the algorithm, which is not practical for 
real-time implementation. Many recent studies have also obtained good results using 
frequencies less than 10 kHz (at sampling frequency of 20 kHz) for DC arc fault 
detection [58], [103], [152]. Therefore, the PV loop current signals from CT are down-
sampled to 20 kHz and used for series arc fault detection. Using such a CT can 
eliminate low-frequency external disturbances (typically less than 1 kHz) caused by 
sand, mechanical vibrations, etc. [58]. Further, a case study is presented in Chapter 5, 
which confirms that the choice of using 20 kHz as sampling frequency is appropriate. 
All the raw CT time-series signals are divided into windowed samples with $l^2$
points and scaled into [-1, 1] for standardisation. After that, two-dimensional (2D)
arrangement is performed to arrange each windowed sample into a 2D matrix with size of
$l \times l$ [153], [154]. The standardisation and 2D arrangement are achieved using (4.20)
and (4.21).
$$x(i, j) = \frac{x_{raw}(l(j - 1) + i) - \min(x_{raw})}{\max(x_{raw}) - \min(x_{raw})} \tag{4.20}$$
$$x(i, j) = 2 \times x(i, j) - 1 \tag{4.21}$$
where $x_{raw}$ is the raw time-series segment and $x$ is the standardised 2D sample. 
Similarly, with increasing window sizes, more useful information can be extracted from
the input signal by the algorithms at the expense of increasing computation complexity
and detection latency. The value of l, which represents the height/width of the square
2D matrix, is 20 based on considerations of the work of other researchers [39], [58],
and a case study with different values of l is presented in Section 5.4. Ultimately, 20,000
normal samples and 20,000 arcing samples are extracted to form the dataset for the case
study. Each sample consists of 400 data points, corresponding to a 20 ms data duration. Note 
 
that these samples are prepared by partitioning many time-series signals of different 
cases into several 20 ms-duration sections without any overlapping. 
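The windowing, min-max scaling, and 2D arrangement of Eqs. (4.20) and (4.21) can be sketched as follows. The function name `to_2d_samples`, the synthetic 50 Hz test signal, and the handling of a perfectly flat window are assumptions made for illustration. Because Eq. (4.20) places point $l(j-1)+i$ at row i and column j, the window is filled column by column, which corresponds to Fortran-order reshaping in NumPy.

```python
import numpy as np

def to_2d_samples(signal, l=20):
    """Split a 1D signal into non-overlapping windows of l*l points,
    scale each window to [-1, 1], and arrange it as an l-by-l matrix."""
    signal = np.asarray(signal, dtype=float)
    n_windows = signal.size // (l * l)
    samples = np.empty((n_windows, l, l))
    for k in range(n_windows):
        raw = signal[k * l * l:(k + 1) * l * l]
        span = raw.max() - raw.min()
        if span == 0:
            scaled = np.zeros_like(raw)   # assumption: a flat window maps to 0
        else:
            scaled = 2.0 * (raw - raw.min()) / span - 1.0   # Eqs. (4.20) and (4.21)
        samples[k] = scaled.reshape(l, l, order="F")        # column-major fill
    return samples

fs = 20_000                                   # sampling frequency in Hz
t = np.arange(fs) / fs                        # one second of hypothetical data
samples = to_2d_samples(np.sin(2 * np.pi * 50 * t))
print(samples.shape, 20 * 20 / fs)            # (50, 20, 20) 0.02
```

With l = 20 and a 20 kHz sampling rate, each window holds 400 points and spans 0.02 s, in line with the sample duration stated above.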
4.5.2. Hyperparameters Setting and Offline Validation Results 
There are several well-known CNN architectures, such as LeNet 5 [155], AlexNet
[156], VGG 16 [157], etc. [158]. LeNet 5, proposed in 1998, is the first popular CNN
architecture, and it is relatively small by today’s standards. Because of its
limited capability of representation learning (aka feature learning), it is now less popular in
some complex tasks such as large image recognition. However, it is quite
suitable for Internet-of-Things applications because of its relatively small number of
parameters. AlexNet is similar to LeNet 5 but with a deeper structure. It is the first CNN
architecture that stacks convolution layers directly on top of each other. After that, CNNs
started to become deeper in order to improve performance. Both AlexNet and VGG 16
demonstrate good results for many complex problems such as image classification and
localisation. However, the main drawback associated with deep CNNs, such as AlexNet
and VGG 16, is the use of millions of parameters (e.g. 60 million and 138 million
parameters in the original applications of AlexNet and VGG 16, respectively).
Therefore, they are computationally expensive and difficult to deploy in real time on a
resource-constrained edge device. The architectures of LeNet 5, AlexNet, and VGG 16
are illustrated in Figure 4.11. 
Firstly, it is worthwhile to investigate the effectiveness of these deep CNNs in
series arc fault detection in PV systems. The original input sizes for LeNet 5, AlexNet,
and VGG 16 are 32 × 32 × 1, 227 × 227 × 3, and 224 × 224 × 3, respectively, while
their minimum required input sizes are 9 × 9 × 1, 37 × 37 × 3, and 32 × 32 × 3,
respectively. 
 
[Figure 4.11 panels, each listing the layer sequence and output shapes for the original input: (a) LeNet 5, input 32×32×1, 10 classes; (b) AlexNet, input 227×227×3, 1000 classes; (c) VGG 16, input 224×224×3, 1000 classes]
Figure 4.11 Original CNN architecture: (a) LeNet 5; (b) AlexNet; (c) VGG 16 
 
In order to apply LeNet 5, AlexNet, and VGG 16 to this application, where the size 
of input sample is 20 × 20 × 1, the following modifications to the CNN models and 
input samples are carried out: 
• For LeNet 5, it can be easily implemented by only replacing the original 
classification layer by a 1-neuron fully-connected classification layer, because 
series arc fault detection is a binary classification problem. No modifications are 
needed for the input samples. 
• For AlexNet, the filter size and stride in the first convolution layer are changed 
from 11 × 11 to 9 × 9 and from 4 to 1, respectively. Similarly, the output layer is 
replaced by a 1-neuron fully-connected classification layer. Each original sample is 
transformed from 20 × 20 × 1 to 20 × 20 × 3 by replicating itself three times 
along the channel axis. 
• For VGG 16, the strides in the last 3 max pooling layers (there are 5 max pooling 
layers in total) are changed from 2 × 2 to 1 × 1 . Similarly, the output layer is 
replaced by a 1-neuron fully-connected classification layer and each original 
sample is transformed from 20 × 20 × 1 to 20 × 20 × 3 by replicating itself three 
times along the channel axis. 
Besides the modifications mentioned above, BN is applied to each convolution 
layer and fully-connected layer (except the classification layer) in order to improve the 
speed, performance, and stability of ANN models especially for deep ANN models 
[138]. The results are shown in Section 4.5.2.5. 
There are two main requirements for real-time DC series arc fault detection in PV 
systems: 
• It requires time-sensitive data processing, 
 
• It can be deployed at the edge: the overall computation burden of the designed 
algorithm cannot exceed the computation capability of the edge device. 
As a result, to address these application-specific requirements and constraints 
mentioned above, it is necessary to design an optimal CNN structure that can achieve a 
balance between required computation efforts and performance on series arc fault 
detection. There are several tunable hyperparameters in a typical CNN structure, 
including the number of convolution layers, the number of filters in each convolution 
layer, the size of filter, the number of fully connected layers, and the number of neurons 
in each fully connected layer. The optimal lightweight CNN structure is shown in 
Figure 4.12 and Table 4.2. 
 
 
[Figure 4.12 pipeline: real-time data captured by CT; N² data points window; min-max normalisation and 2D arrangement (point (j-1)N+i placed at row i, column j, with i, j = 1, ..., N); convolution layer (5 by 5) + batch normalisation layer + ReLU layer + max pooling layer (2 by 2); fully connected layer + batch normalisation layer + ReLU layer; fully connected layer + batch normalisation layer + ReLU layer; fully connected layer + classification layer. Insets: feature visualisation in the convolution layer and in the max pooling layer]
Figure 4.12 The optimal lightweight CNN structure and feature visualisation 
 
Table 4.2 Structure and parameters of the optimal lightweight CNN
Layer No. | Layer type | Kernel size | No. of kernel | Stride | Zero-padding | BN | Activation | Output shape
1 | 2D Convolution | 5×5 | 3 | 1 | No | Yes | ReLU | 16×16×3
2 | Maxpooling | 2×2 | 1 | 2×2 | No | No | - | 8×8×3
3 | Dense | 8 | 1 | - | - | Yes | ReLU | 8
4 | Dense | 5 | 1 | - | - | Yes | ReLU | 5
5 | Dense | 1 | 1 | - | - | No | Sigmoid | 1
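As a worked check of the model size, the sketch below counts the parameters implied by Table 4.2. It assumes that each BN layer contributes four values per feature (scale, offset, moving mean, and moving variance), as some deep learning frameworks report in their total parameter count; this counting convention and the function name are assumptions, not statements from the thesis. Under this assumption the count for the structure in Table 4.2 comes to 1737.

```python
def lightweight_cnn_params(input_size=20, kernel=5, filters=3, dense=(8, 5, 1), pool=2):
    """Count parameters of a conv + BN + max-pool + dense stack as in Table 4.2."""
    conv_out = input_size - kernel + 1                 # no zero-padding, stride 1
    pooled = conv_out // pool                          # 2x2 max pooling with stride 2
    total = kernel * kernel * 1 * filters + filters    # conv weights + biases (1 input channel)
    total += 4 * filters                               # BN after the convolution (assumed 4 per channel)
    fan_in = pooled * pooled * filters                 # flattened feature vector
    for idx, units in enumerate(dense):
        total += fan_in * units + units                # dense weights + biases
        if idx < len(dense) - 1:
            total += 4 * units                         # BN on hidden dense layers only
        fan_in = units
    return total

print(lightweight_cnn_params())   # 1737
```

The dominant term is the first dense layer (8 × 8 × 3 = 192 inputs times 8 neurons), which explains why a larger filter, and hence a smaller feature map, reduces the total.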
 
The optimal CNN structure is determined by hyperparameter tuning through trial
and error based on the following extensive studies. For training, the dataset is divided into a
training dataset and a testing dataset with a ratio of 50%:50%. Some other training
parameters are fixed for all case studies in this chapter: the learning rate is 0.001, the
maximum number of epochs is 100, the batch size is 64, the optimiser is stochastic
gradient descent (SGD), and categorical cross-entropy is used as the loss function. The
training process and offline validation are performed using a CentOS 7 Linux operating
system with a Tesla P100 GPU and an Intel(R) Xeon(R) Gold 6126 CPU throughout the
remaining chapters. The following metrics are used to evaluate the performance of the algorithms: 
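$$\mathrm{Arcing} = \frac{TP}{TP + FP} \times 100\% \tag{4.22}$$
$$\mathrm{Normal} = \frac{TN}{TN + FN} \times 100\% \tag{4.23}$$
$$\mathrm{Sensibility} = \frac{TP}{TP + FN} \times 100\% \tag{4.24}$$
$$\mathrm{Safety} = \frac{TN}{TN + FP} \times 100\% \tag{4.25}$$
$$\mathrm{Overall\ Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{4.26}$$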
where TP, TN, FP, and FN represent the number of correct detections of fault, correct 
 
detections of normal conditions, undetected faults, and nuisance tripping, respectively. 
Sensibility, which is also known as precision, is an indicator to measure the system 
sensitivity related to normal operating conditions and normal transient events (the rate 
to avoid false alarm). Safety, which is also known as recall, is an indicator to measure 
the system sensitivity to DC series arc fault (the rate to avoid missing alarm). 
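The metrics of Eqs. (4.22) to (4.26) can be computed as in the sketch below. Note that, in the convention stated above, FP counts undetected faults and FN counts nuisance tripping, so the arguments are named explicitly to avoid confusion with the more common usage. The counts in the example are hypothetical values chosen for illustration, assuming 10,000 arcing and 10,000 normal test samples; they are not reported in the thesis.

```python
def arc_metrics(tp, tn, undetected_faults, nuisance_trips):
    """Metrics of Eqs. (4.22)-(4.26) in percent, using the convention
    FP = undetected faults and FN = nuisance tripping."""
    fp, fn = undetected_faults, nuisance_trips
    return {
        "arcing": 100.0 * tp / (tp + fp),
        "normal": 100.0 * tn / (tn + fn),
        "sensibility": 100.0 * tp / (tp + fn),
        "safety": 100.0 * tn / (tn + fp),
        "overall": 100.0 * (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical counts for a balanced test set of 20,000 samples.
m = arc_metrics(tp=9946, tn=9999, undetected_faults=54, nuisance_trips=1)
for name, value in m.items():
    print(f"{name}: {value:.2f}%")
```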
4.5.2.1. Size of Filter 
To demonstrate the effect of filter size on CNN performance, other settings of the 
CNN in Table 4.2 are kept the same. As shown in Table 4.3, the filter size is set to 
3 × 3, 5 × 5, 7 × 7, and 9 × 9. As the filter size increases from 3 × 3 to 5 × 5, the 
testing accuracy is greatly improved. Also, it can be seen that a larger filter size 
introduces fewer parameters in the CNN structure. However, it is found that increasing 
the filter size further would lead to a decline in the accuracy, especially arcing accuracy. 
Thus, the optimal filter size is 5 × 5. 
Table 4.3 Influence of filter size on CNN performance
Filter size | Arcing | Normal | Sensibility | Safety | Overall Accuracy | Number of parameters
3 × 3 | 99.10% | 99.36% | 99.36% | 99.10% | 99.23% | 2097
5 × 5 | 99.46% | 99.99% | 99.99% | 99.46% | 99.72% | 1737
7 × 7 | 99.20% | 99.99% | 99.99% | 99.21% | 99.60% | 1449
9 × 9 | 99.13% | 99.96% | 99.96% | 99.14% | 99.54% | 1223
 
4.5.2.2. Number of Filters in the Convolution Layer 
Similarly, to investigate the impact of the number of filters in the convolution layer, 
the other parts in the optimal CNN structure are fixed. As presented in Table 4.4, the
overall accuracy of the CNN improves substantially, by 18.51 percentage points (from 80.88% to
99.39%), when the number of filters
increases from 1 to 2. When it rises from 2 to 3, there is still a noticeable enhancement 
 
in arcing accuracy from 98.78% to 99.46%. As can be seen, a slight improvement can 
be obtained when the number of filters is further increased. However, the trade-off is 
significant growth in the total number of parameters, which is not cost-effective. 
Therefore, 3 filters are selected in the first CNN layer. 
Table 4.4 Influence of number of filters on CNN performance
Number of filters (5 × 5) | Arcing | Normal | Sensibility | Safety | Overall Accuracy | Number of parameters
1 | 81.73% | 80.04% | 80.37% | 81.42% | 80.88% | 653
2 | 98.78% | 99.99% | 99.99% | 98.79% | 99.39% | 1195
3 | 99.46% | 99.99% | 99.99% | 99.46% | 99.72% | 1737
5 | 99.54% | 99.98% | 99.98% | 99.54% | 99.76% | 2821
7 | 99.49% | 99.99% | 99.99% | 99.49% | 99.74% | 3905
 
4.5.2.3. Number of Convolution Layers 
In general, when there is adequate training data, more convolution layers can lead 
to better performance since higher level and more generalised features can be learned 
by the CNN [136]. As expected in this case study, CNN with two convolution layers 
can improve the arcing accuracy from 99.46% to 99.57% as shown in Table 4.5. 
However, the computation burden increases significantly: the CNN with two 
convolution layers has about 4.4 times as many parameters as that with one convolution 
layer. Therefore, one convolution layer is sufficient to achieve high performance with 
low computation requirements. 
Table 4.5 Influence of number of convolution layers on CNN performance
Convolution layer settings | Arcing | Normal | Sensibility | Safety | Overall Accuracy | Number of parameters
3-MP | 99.46% | 99.99% | 99.99% | 99.46% | 99.72% | 1737
3-MP-6 | 99.57% | 99.99% | 99.99% | 99.57% | 99.78% | 7593
3-MP-6-12 | 99.62% | 99.99% | 99.99% | 99.62% | 99.80% | 8685
 
4.5.2.4. Number of Fully Connected Layers and Number of Neurons in Each Layer 
Similarly, CNNs with two fully connected layers followed by a classification layer
(the arcing accuracies are above 99%) are better than CNNs with one fully connected
layer followed by a classification layer (the arcing accuracies are less than 99%) as
shown in Table 4.6. When the number of fully connected layers is fixed, as the number
of neurons in each fully connected layer increases, CNNs tend to demonstrate slightly
better performance at the expense of a substantially increasing number of parameters.
Therefore, the proposed setting of the fully connected layers is optimal. 
Table 4.6 Influence of fully connected layer settings on CNN performance
Fully connected layer settings | Arcing | Normal | Sensibility | Safety | Overall Accuracy | Number of parameters
8-1 | 98.74% | 99.96% | 99.96% | 98.76% | 99.35% | 1675
16-1 | 98.73% | 99.99% | 99.99% | 98.75% | 99.36% | 3259
32-1 | 98.81% | 99.99% | 99.99% | 99.82% | 99.40% | 6427
8-5-1 | 99.46% | 99.99% | 99.99% | 99.46% | 99.72% | 1737
16-10-1 | 99.48% | 99.99% | 99.99% | 99.48% | 99.73% | 3463
32-20-1 | 99.53% | 99.97% | 99.97% | 99.53% | 99.75% | 7155
64-40-1 | 99.50% | 99.96% | 99.96% | 99.50% | 99.73% | 15499
 
4.5.2.5. Comparison with Very Deep CNNs 
The proposed lightweight CNN is also compared to deeper CNNs. The settings and 
modifications of these deep CNNs are discussed in the beginning of Section 4.5.2, and 
the results are presented in Table 4.7. It is worth mentioning that very deep CNNs are
difficult to train without BN, generally because of gradient vanishing. Modified AlexNet
and modified VGG 16 without BN fail to converge most of the time. Also, the
performance of LeNet 5 without BN (overall accuracy of 99.37%) is worse than
LeNet 5-BN (overall accuracy of 99.84%). Although deep CNNs can achieve near-
perfect overall accuracy, it is not possible to deploy such complex models in real-time 
 
in a cost-effective manner. For example, the overall accuracy for modified VGG 16-BN 
is 99.95%, while the number of parameters is about 40 million. 
Table 4.7 Performance comparison of different CNNs
CNN model | Arcing | Normal | Sensibility | Safety | Overall Accuracy | Number of parameters
LeNet 5 | 99.15% | 99.58% | 99.58% | 99.15% | 99.37% | 28589
LeNet 5-BN | 99.74% | 99.94% | 99.94% | 99.74% | 99.84% | 29493
Modified AlexNet-BN | 99.84% | 99.99% | 99.99% | 99.84% | 99.91% | 24757761
Modified VGG 16-BN | 99.91% | 99.99% | 99.99% | 99.91% | 99.95% | 39941313
Proposed CNN | 99.46% | 99.99% | 99.99% | 99.46% | 99.72% | 1737
 
In summary, the proposed lightweight CNN is proven to be optimal, striking a good 
balance between accuracy and computation burden, and it is well suited for real-time 
deployment in resource-constrained edge devices. 
4.5.3. Evaluation of Different ML Classifiers 
Although several ML algorithms demonstrate promising results in recent literature, 
the performance of these ML classifiers cannot be directly compared because the 
datasets used in such investigations are different. Therefore, a comparative study among 
different popular ML algorithms using the same datasets is performed and presented in 
this section. Their effectiveness in DC series arc fault detection in PV systems is 
examined. 
4.5.3.1. Datasets Preparation 
One of the datasets is the same as the one described in Section 4.5.1, where no 
manual feature extraction is performed on the raw CT signals. 
In addition, to investigate the influence of feature extraction, a 2D feature map is 
designed based on the wavelet packet entropy as described in Chapter 3. The fault 
 
signal is analysed frame-by-frame to extract the common patterns. After performing a 3-
level wavelet packet entropy calculation on a frame of data, a feature vector with the
size of 8 × 1 is formulated. To standardise the data, self-normalisation is performed
so that the entries of each vector sum to 1. The key point to note during the feature extraction process
is that the extracted features should be both sufficient to identify the faults and immune
to false tripping. Since the fault current inherently exhibits random variation, the
feature needs to be more reliable against false tripping. To include more
temporal information and improve the reliability of the proposed feature, 6 adjacent
frames with 50% overlapping are merged into a 2D feature map. Each feature map is
calculated using a 0.07-second CT signal at a 20 kHz sampling frequency, and the
frame size is 400 (corresponding to 0.02 s duration). 
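The framing arithmetic can be checked with a short sketch: six frames of 400 points with 50% overlap span 400 + 5 × 200 = 1400 points, i.e. 0.07 s at 20 kHz. The code below builds an 8 × 6 map from such a segment. The wavelet packet entropy itself is not implemented here; `placeholder_band_features` is a hypothetical stand-in that merely returns eight positive values per frame, and all names are illustrative assumptions.

```python
import numpy as np

FS = 20_000                                   # sampling frequency in Hz
FRAME = 400                                   # points per frame (0.02 s)
N_FRAMES = 6                                  # frames per feature map
HOP = FRAME // 2                              # 50% overlap
SEGMENT_LEN = FRAME + (N_FRAMES - 1) * HOP    # 1400 points = 0.07 s

def frame_signal(segment):
    """Cut one segment into 6 half-overlapping frames of 400 points."""
    segment = np.asarray(segment, dtype=float)
    if segment.size < SEGMENT_LEN:
        raise ValueError("segment too short for one feature map")
    return np.stack([segment[k * HOP:k * HOP + FRAME] for k in range(N_FRAMES)])

def self_normalise(v):
    """Scale a non-negative feature vector so that its entries sum to 1."""
    v = np.asarray(v, dtype=float)
    return v / v.sum()

def placeholder_band_features(frame):
    # Stand-in only, NOT wavelet packet entropy: mean magnitude of 8 equal chunks.
    return np.abs(frame).reshape(8, -1).mean(axis=1) + 1e-12

def feature_map(segment, band_feature_fn):
    columns = [self_normalise(band_feature_fn(f)) for f in frame_signal(segment)]
    return np.stack(columns, axis=1)          # shape (8 bands, 6 frames)

rng = np.random.default_rng(0)
segment = rng.normal(size=SEGMENT_LEN)        # hypothetical CT signal segment
frames = frame_signal(segment)
fmap = feature_map(segment, placeholder_band_features)
print(frames.shape, fmap.shape, SEGMENT_LEN / FS)
```

To reproduce the described feature, `band_feature_fn` would be replaced by a function returning the eight 3-level wavelet packet entropy values of a frame.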
The detailed formulation process of 2D feature map based on normalised wavelet 
packet entropy is shown in Figure 4.13. Some examples of normal and arcing 2D 
feature map are also visualised. It can be seen that high entropy values tend to be 
concentrated in the intermediate frequency bands (bands 3-7) in the normal state, whilst
they tend to be concentrated in the lower frequency bands (bands 1-4) in the arcing state.
This gives an intuitive indication that the two states have distinctive features.
Therefore, the designed feature map is suitable as input to different ML classifiers
to obtain higher-level features for classification. 
Ultimately, 20,000 arcing feature maps and 20,000 normal feature maps are 
prepared as the second dataset. 
 
[Figure 4.13 content: each frame (400 points) yields a normalised wavelet packet entropy vector (Band 1 to Band 8); frames with 50% overlapping are merged into an 8 by 6 2D feature map. Panels: visualisation of normal 2D feature map; visualisation of arcing 2D feature map.]
Figure 4.13 Feature extraction process and visualisation of normal/arcing feature map 
 
4.5.3.2. Settings for Different ML Classifiers 
For the comparative study, conventional ML methods including the shallow BPNN, 
NB, SVM, and RF are selected. To obtain the optimal performance, grid search with 
five-fold cross-validation is performed during the training of each conventional ML 
classifier [115]. Note that the 2D samples are flattened before being fed into these ML 
classifiers. The settings of the conventional ML classifiers for both datasets are as follows: 
• For SVM: Gaussian, linear, polynomial (cubic), and sigmoid kernels are considered; 
the regularisation parameter is set to 0.01, 0.1, 1, and 10; the gamma parameter is set to 
auto, which is 1 divided by the number of features. 
• For NB: the Gaussian kernel is selected. 
• For shallow BPNN: ReLU and sigmoid are selected as activation functions. Two 
hidden layers are used, and the number of neurons in each hidden layer is determined by 
trial and error. Generally, the number of neurons in the first hidden layer is identical to 
the number of features in the input layer, and the second hidden layer has half as many 
neurons. 
• For RF: The maximum depth of the tree is set to 10, 30, 50, 80, and 100; the 
maximum number of features to consider when looking for the best split is set to 2, 
3, log2(number of input features), and square root of the number of input features; 
minimum samples required to be at a leaf node is set to 3, 5, and 10; the minimum 
number of samples required to split an internal node is set to 6, 10, and 20; the 
number of trees in the forest is set to 100, 200, and 400. 
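To give a sense of the size of the search implied by the settings above, the SVM and RF grids can be enumerated as in the sketch below. The dictionary keys are hypothetical labels chosen here, not the option names of any particular library, and rounding log2 and the square root down to integers is an assumption.

```python
from itertools import product
from math import log2, sqrt

# Flattened 8 x 6 feature map; use 400 for the flattened 20 x 20 raw input.
N_FEATURES = 48

svm_grid = {
    "kernel": ["gaussian", "linear", "cubic", "sigmoid"],
    "regularisation": [0.01, 0.1, 1, 10],
    "gamma": [1.0 / N_FEATURES],          # "auto": 1 / number of features
}

rf_grid = {
    "max_depth": [10, 30, 50, 80, 100],
    # Assumption: non-integer values are rounded down.
    "max_features": [2, 3, int(log2(N_FEATURES)), int(sqrt(N_FEATURES))],
    "min_samples_leaf": [3, 5, 10],
    "min_samples_split": [6, 10, 20],
    "n_trees": [100, 200, 400],
}


def expand(grid):
    """Return every combination of the listed settings as a dict."""
    keys = list(grid)
    return [dict(zip(keys, values))
            for values in product(*(grid[k] for k in keys))]


FOLDS = 5                                  # five-fold validation
svm_points = expand(svm_grid)
rf_points = expand(rf_grid)
svm_fits = len(svm_points) * FOLDS         # model fits needed for the SVM
rf_fits = len(rf_points) * FOLDS           # model fits needed for the RF
```

Under these assumptions the RF grid is far larger than the SVM grid, so the RF search dominates the tuning cost of the conventional classifiers.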
Furthermore, DL methods, including deep BPNN, SAE, LSTM, Bi-LSTM, and 
CNN, are chosen. The common settings for the training of neural networks are the same 
as described in Section 4.5.2. As before, 2D samples are flattened before being fed into 
the deep BPNN. Also, a BN-ReLU layer is applied after each fully connected layer 
(except the final classification layer), convolution layer, LSTM layer, or Bi-LSTM layer 
in this case study. The neural network structures are also determined by extensive trial 
and error. 
The optimal settings of different DL models for the first dataset are as follows: 
• For deep BPNN: it consists of four hidden layers. The number of neurons in the 
first hidden layer is identical to the number of features in the input layer, and the 
number of neurons in the second hidden layer is reduced by half. The last two 
hidden layers and the classification layer are the same as the designed optimal 
lightweight CNN in Table 4.2, which is the 8-5-1 structure. 
• For SAE: the encoder consists of three fully connected layers with 200, 100, and 50 
neurons, respectively, i.e. each additional fully connected layer has half as many 
neurons as the previous one. 
• For LSTM: it consists of two LSTM layers followed by the 8-5-1 structure for 
fully-connected layers. The first LSTM layer has 20 units, and the type of 
connection between the first LSTM layer and the second LSTM layer is many-to-
many. The second LSTM layer has 10 units, and the type of connection between 
the second LSTM layer and the next fully-connected layer is many-to-one. 
• For Bi-LSTM: it has the same structure as the LSTM model except the two LSTM 
layers are replaced by two Bi-LSTM layers. 
• For CNN: it has the same structure as shown in Table 4.2. 
For the second dataset, the settings in the first few layers are modified because the 
input size changes from 20 × 20 to 8 × 6 (e.g. for LSTM, the number of units in the 
first and second layers is changed from 20 to 10 and from 10 to 5, respectively). To 
keep the number of parameters approximately the same, the 8-5-1 structure is changed 
to 16-10-1 for the second dataset. 
 
4.5.3.3. Results of Comparative Study 
The results are presented in detail in Table 4.8. The same metrics, as presented in 
(4.22) to (4.26), are used to evaluate the performance of different ML algorithms. 
When raw data (the first dataset) is used as input, DL algorithms, including LSTM, 
Bi-LSTM, and CNN demonstrate superior diagnosis performance compared to other 
ML methods. The shallow BPNN with sigmoid activation function only reaches 72.24% 
accuracy, while the situation becomes much better when ReLU activation function is 
used. This is because ReLU introduces sparsity into the network for regularisation, and 
redundant features can be effectively discarded. The deep BPNN further increases the 
overall accuracy since hierarchical and more robust features have been learnt by the 
deep structure. Conventional methods such as NB, SVM, and RF show mediocre 
performance as expected, because the input dimension is too large. The CNN with the 
proposed lightweight structure achieves the best detection accuracy with an overall 
accuracy of 99.72%. 
When effective feature extraction is applied (the second dataset), all conventional 
ML methods improve significantly. Among them, SVM demonstrates the best 
overall detection accuracy of 97.55%. However, the performance of the LSTM, Bi-LSTM, 
and CNN models is degraded (whereas the deep BPNN and SAE improve), possibly 
because some useful information is lost during the manual feature extraction process. 
Nevertheless, CNN still achieves the best overall accuracy of 97.95%. 
Among all the tested ML methods, CNN achieves the best overall classification 
accuracy regardless of whether feature extraction is applied. Furthermore, DL 
models with raw data as input achieve the best overall accuracy. 
 
 
Table 4.8 Evaluation of different popular ML methods 
Method                   Arcing   Normal   Sensibility   Safety   Overall Accuracy 
Using raw data as input (0.02 seconds, 20-kHz sampling rate) 
Shallow BPNN (Sigmoid)   62.30%   82.18%   77.76%        68.55%   72.24% 
Shallow BPNN (ReLU)      89.75%   97.46%   97.25%        90.48%   93.61% 
NB                       72.34%   71.42%   71.68%        72.08%   71.88% 
SVM                      80.89%   87.96%   87.04%        82.15%   84.42% 
RF                       85.81%   84.84%   84.99%        85.67%   85.32% 
Deep BPNN                92.56%   98.44%   98.34%        92.97%   95.50% 
SAE                      93.95%   98.21%   98.13%        94.20%   96.08% 
LSTM                     98.18%   99.48%   99.47%        98.20%   98.83% 
Bi-LSTM                  98.95%   99.31%   99.31%        98.95%   99.13% 
CNN                      99.46%   99.99%   99.99%        99.46%   99.72% 
Using 2D feature map based on normalised wavelet packet entropy as input (0.02 × 3.5 = 0.07 seconds, 20-kHz sampling rate) 
Shallow BPNN (Sigmoid)   93.97%   88.36%   88.98%        93.61%   91.16% 
Shallow BPNN (ReLU)      95.23%   95.37%   95.36%        95.24%   95.30% 
NB                       94.07%   88.93%   89.47%        93.75%   91.50% 
SVM                      96.87%   98.23%   98.21%        96.91%   97.55% 
RF                       96.64%   96.38%   96.39%        96.63%   96.51% 
Deep BPNN                96.99%   98.31%   98.29%        97.03%   97.65% 
SAE                      97.27%   98.50%   98.48%        97.30%   97.88% 
LSTM                     92.81%   97.82%   97.71%        93.15%   95.32% 
Bi-LSTM                  96.52%   96.35%   96.36%        96.51%   96.44% 
CNN                      97.85%   98.05%   98.05%        97.85%   97.95% 
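The columns of Table 4.8 can be cross-checked against each other. The sketch below assumes equally sized arcing and normal test sets, and reads Sensibility and Safety as the fraction of correct decisions among the samples predicted as arcing and as normal, respectively. This reading is inferred from the numbers in the table; the metrics are formally defined in (4.22) to (4.26), which are not reproduced in this section. The function name is hypothetical.

```python
def derived_metrics(arcing_acc, normal_acc):
    """Derive (sensibility, safety, overall) in percent from the two
    per-class accuracies in percent, assuming equal class sizes."""
    a = arcing_acc / 100.0          # fraction of arcing samples detected
    n = normal_acc / 100.0          # fraction of normal samples passed
    sensibility = a / (a + (1.0 - n))   # correct among predicted arcing
    safety = n / (n + (1.0 - a))        # correct among predicted normal
    overall = (a + n) / 2.0
    return tuple(round(100.0 * v, 2) for v in (sensibility, safety, overall))


# (Arcing, Normal, Sensibility, Safety, Overall) copied from Table 4.8.
table_rows = {
    "Shallow BPNN (Sigmoid), raw": (62.30, 82.18, 77.76, 68.55, 72.24),
    "NB, raw": (72.34, 71.42, 71.68, 72.08, 71.88),
    "CNN, raw": (99.46, 99.99, 99.99, 99.46, 99.72),
    "SVM, feature map": (96.87, 98.23, 98.21, 96.91, 97.55),
}


def max_deviation(rows):
    """Largest absolute gap (percentage points) between derived and tabulated."""
    worst = 0.0
    for arcing, normal, *tabulated in rows.values():
        derived = derived_metrics(arcing, normal)
        worst = max(worst, *(abs(d - t) for d, t in zip(derived, tabulated)))
    return worst
```

For the rows listed, the derived values agree with the tabulated ones to within rounding, which supports the assumed reading of the columns.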
 
4.5.4. Real-time Implementation and Validation Results 
For real-time validation experiments, the PV emulator is programmed to simulate a 
1.5-kW grid-tied PV system with an open-circuit voltage (𝑉𝑜𝑐) of 207.2 V and a 
short-circuit current (𝐼𝑠𝑐) of 7.95 A at standard test condition (STC). Additionally, a 
21 μH inductance is connected between the PV emulator and the 1.5-kW single-phase 
solar inverter. This inductance simulates the filtering effect of the PV connection 
cables [14] and also tests the generalisation ability of the proposed method. 
For real-time implementation, a final decision operator is typically introduced to 
strike a balance among reliability, accuracy, and response speed [52], [159]. For 
instance, a final decision is made only when the detection algorithm outputs 4 
successive arcing events [52]. Although the final detection time is then 4 times longer 
than the original detection time, the false tripping rate is greatly reduced. 
In this thesis, a multiple-window strategy based on the binomial distribution is adopted. 
A flow chart of the overall real-time series arc fault detection is illustrated in Figure 
4.14. Here, $\alpha$ and $\beta$ are the mis-operation rate and mal-function rate of every 
output from the LTCNN classifier, and the number of erroneous outputs within a window 
is assumed to be binomially distributed. The improved mis-operation rate and 
mal-function rate, $P_{mis}$ and $P_{mal}$, can be calculated by (4.27) and (4.28): 
$$P_{mis} = \sum_{i=0}^{m-1} C_k^i \, (1-\alpha)^i \, \alpha^{k-i} \qquad (4.27)$$ 
$$P_{mal} = 1 - \sum_{i=0}^{m-1} C_k^i \, \beta^i \, (1-\beta)^{k-i} \qquad (4.28)$$ 
where $C_k^i$ is the binomial coefficient, and $k \ge i \ge 0$. By varying the values of m 
and k, one can adjust the false tripping rate and the malfunction rate. k is chosen based 
on the worst-case detection time, which is flexible for the user as long as it satisfies 
the absolute time limit indicated in the UL-1699B Standard, which is 
$T_R = \max(750/(I_{arc} \times V_{arc}), 2.5)$ seconds [15]. Considering that the detection 
time of existing arc fault detectors is in the order of hundreds of ms, k=10 is chosen to 
get a worst-case detection time of 200 ms (10 windows of 20 ms). After choosing the 
value of k, the value of m can be manually selected to achieve satisfactory performance. 
For example, given the accuracies of the proposed CNN in Table 4.7, $\alpha$ = 0.54% and 
$\beta$ = 0.01% can be obtained. Then, $P_{mis}$ and $P_{mal}$ are both almost 0 using 
(4.27) and (4.28) with m=3 and k=10. Similar results can be obtained with m=2 and k=10, 
which indicates that m=2 is sufficiently accurate; however, m=3 is chosen to make the 
detection algorithm more immune to false tripping. Therefore, the final alarm signal is 
issued when at least m=3 samples are determined as arcing in a sliding window 
consisting of k=10 samples with a step size of 1. Thus, great improvements in reliability 
and accuracy can be achieved. Tests using pre-recorded time-series data without 
shuffling verify that no wrong decisions occurred. 
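As a quick numerical check of the "almost 0" statement, the two binomial sums can be evaluated directly. This is a minimal sketch with hypothetical function names. The second function evaluates the upper tail of the sum, which is algebraically equal to one minus the lower-tail sum but avoids subtracting two nearly equal floating-point numbers.

```python
from math import comb


def p_mis(alpha, m, k):
    """Probability that fewer than m of k outputs flag arcing when each
    output flags arcing with probability (1 - alpha)."""
    return sum(comb(k, i) * (1.0 - alpha) ** i * alpha ** (k - i)
               for i in range(m))


def p_mal(beta, m, k):
    """Probability that at least m of k outputs flag arcing when each
    output flags arcing with probability beta (upper-tail form)."""
    return sum(comb(k, i) * beta ** i * (1.0 - beta) ** (k - i)
               for i in range(m, k + 1))


ALPHA = 0.0054   # 0.54 %, per-output rate quoted in the text
BETA = 0.0001    # 0.01 %, per-output rate quoted in the text
K = 10           # outputs per decision window

results = {m: (p_mis(ALPHA, m, K), p_mal(BETA, m, K)) for m in (2, 3)}
```

With these inputs both probabilities are many orders of magnitude below the per-output rates for m=2 as well as m=3, and moving from m=2 to m=3 lowers the second probability further, in line with the choice of m=3 for better immunity to false tripping.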
[Figure 4.14 flowchart blocks: real-time current data captured by CT; raw input (i), 20 ms sliding window (i is the window index); min-max normalisation (i); 2D arrangement (i); output of the LTCNN classifier, O(i) (DL based classification); if i > k-1 then FlagCount(i) = count1([O(i), O(i-1), ..., O(i-k+1)]), otherwise FlagCount(i) = count1([O(i), O(i-1), ..., O(1)]); if FlagCount(i) > (m-1) then series arc fault detection (accuracy and reliability improvement); i++.]
 
Figure 4.14 Flowchart of real-time series arc fault detection 
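The decision logic of the flowchart can be sketched as a small stateful counter. The class and method names are hypothetical, and the classifier output is assumed to be 1 for arcing and 0 for normal. Before k outputs are available, the count runs over all outputs received so far, as in the flowchart.

```python
from collections import deque


class MultiWindowDecision:
    """Issue the final alarm when at least m of the last k classifier
    outputs are arcing (sliding window, step size of one output)."""

    def __init__(self, m=3, k=10):
        if not 1 <= m <= k:
            raise ValueError("require 1 <= m <= k")
        self.m = m
        # deque with maxlen=k drops the oldest output automatically.
        self.outputs = deque(maxlen=k)

    def update(self, output):
        """Add one classifier output and return True if the alarm fires."""
        self.outputs.append(1 if output else 0)
        flag_count = sum(self.outputs)       # FlagCount(i)
        return flag_count > self.m - 1


def run(decision, stream):
    """Feed a sequence of classifier outputs and collect the decisions."""
    return [decision.update(o) for o in stream]
```

For example, a single isolated arcing output never raises the alarm with m=3, whereas three arcing outputs within any 10 consecutive outputs do.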
Then, the trained CNN is deployed in a prototype based on an NI-CompactRIO-9030 
real-time embedded system and tested in various conditions. NI-CompactRIO is a 
general-purpose real-time embedded industrial controller (with the NI Linux real-time 
operating system), which is convenient and flexible for developing prototypes and 
proofs of concept. The overall resource utilisation of NI-CompactRIO for the 
deployment of the proposed CNN is less than 10%. For mass production and 
commercialisation, the algorithm could be implemented in field programmable gate 
arrays or application-specific integrated circuits, which could result in significant cost 
reduction. For example, low-cost devices such as Kendryte K210 and Zynq-7020 based 
development systems, which cost in the range of AUD 30-200, are capable of running 
more complex CNNs than the proposed CNN. Therefore, low-cost real-time 
implementation is feasible. The real-time test results are presented in the 
rest of this section. The oscilloscope screen display shows 4 different signals: 
• Internal output digital signal of CNN (yellow trace, CH1) 
• Loop current captured by the current probe with a ratio of 1:10 (green trace, CH2) 
• Loop current captured by the CT, which is the input signal (blue trace, CH3) 
• Output digital signal of the final decision with the multiple-window strategy (pink 
trace, CH4). 
Sudden changes in irradiance can cause some AFCIs/AFDs to trip because 
they detect series arc faults by monitoring rapid changes in the current signal 
[37], [58]. In Figure 4.15, the irradiance level is reduced from 1000 W/m² to 
500 W/m² in steps of 125 W/m² using the PV Power Profile Emulation software to 
simulate some small disturbances. In Figure 4.16, a relatively large disturbance with a 
large current drop (approximately 50%) is produced by changing the irradiance level 
from 1000 W/m² to 600 W/m². The proposed algorithm does not trip on these step 
changes. Throughout the experiment, although a mis-operation event is generated by 
the CNN during normal conditions as shown in Figure 4.17, the final decision of the 
algorithm is still correct owing to the multiple-window strategy. 
In Figure 4.18, an inrush current caused by the inverter initialisation is observed. 
After about 30 seconds, the inverter starts to adjust the operating point to the 
maximum power point, and the signature of the inverter noise varies dramatically, as 
illustrated by the blue trace. During this start-up period, the proposed algorithm does 
not experience any unwanted tripping. 
[Figure 4.15 annotations: step changes caused by irradiance level changes; no response] 
Figure 4.15 Response to small step changes induced by irradiance level changes 
[Figure 4.16 annotations: a large step change caused by irradiance level change; no response] 
Figure 4.16 Response to a relatively large step change 
[Figure 4.17 annotations: step changes caused by irradiance level changes; a mis-operation during the normal states; final decision is correct] 
Figure 4.17 A mis-operation is experienced during normal conditions 
[Figure 4.18 annotations: inverter initialisation transient; inverter start-up and MPPT operation] 
Figure 4.18 Response to start-up transients and MPPT operation of the inverter 
Not only is the proposed algorithm able to respond to the series arc fault generated 
during the inverter start-up period in 56 ms, as shown in Figure 4.19, but it can also 
detect short-term intermittent series arc faults accurately within 60 ms, as shown in 
Figure 4.20 and Figure 4.21. In Figure 4.22, even though a mal-function event is 
experienced from the CNN during the arcing conditions, the final decision of the 
algorithm is correct. This event might be due to the arc noise reduction caused by 
long-term arc burning: the higher temperature makes the arc more stable and generates 
less variation in the loop current signal [77], thus hiding the arc signatures. 
 
 
[Figure 4.19 annotations: a series arc fault during inverter start-up; 54 ms] 
Figure 4.19 Response to series arc fault during inverter start-up and MPPT operation 
[Figure 4.20 annotations: several intermittent series arc faults; 54 ms; 58 ms; sustained arcing; no response] 
Figure 4.20 Response to several intermittent series arc faults followed by a sustained 
arcing 
[Figure 4.21 annotations: several intermittent series arc faults; 53 ms; 58 ms; step change due to sudden irradiance level change; arc extinguish; no response] 
Figure 4.21 Response to several intermittent series arc faults followed by an arcing 
with increasing gap distance 
[Figure 4.22 annotations: step change due to sudden irradiance level change; sustained arc fault; a mal-operation during the arc states; final decision is correct] 
Figure 4.22 A malfunction experienced during arc conditions 
4.6. Discussion and Recommendations 
As demonstrated in Table 4.8, ML methods, especially DL methods, can achieve 
excellent DC series arc fault diagnosis accuracy when sufficient labelled data is 
available for training. However, this is generally not the case for many practical 
applications [160], [161]. In the following, the practical challenges are discussed, and 
some potential solutions are presented. 
4.6.1. Imbalanced Dataset or Small Dataset 
Training ML classifiers with an imbalanced or small dataset can severely affect the 
classification performance. Researchers in [115] report on the impacts of the number of 
training samples on classification accuracy. It is found that the diagnosis accuracy 
reduces with fewer training samples for all types of ML methods investigated in [115]. 
ML classifiers trained with an imbalanced dataset tend to concentrate on classifying the 
majority class (which has sufficient samples) while neglecting the minority class 
(which has insufficient samples). This imbalance is common in practice: fault data in the 
target system are usually rare or even unavailable. To mitigate this problem, a common way 
is to carry out simulation or superimpose the normal signals with Gaussian/Pink noise 
to enlarge the fault dataset [103], [131], [162]. However, it is still unclear whether the 
artificial fault samples superimposed with such noises share the same features with real 
fault signals or not. For example, Andrea et al. in [163] have shown that the Gaussian 
noise used in their arc fault simulation is not a good choice, especially for time-domain 
fluctuation patterns. Using a high-quality dataset is a pre-requisite for ML methods, 
especially for end-to-end DL algorithms without feature extraction. As a result, 
advanced data augmentation techniques need to be developed and investigated to create 
high-quality synthetic samples to enlarge the dataset. 
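For reference, the common noise-superposition approach mentioned above can be sketched as follows. The function name and the noise level are hypothetical, and, as cautioned in the text, samples generated this way are not guaranteed to share the features of real arc fault signals.

```python
import random


def add_gaussian_noise(normal_window, noise_std, seed=None):
    """Return a synthetic window: the normal signal plus zero-mean
    Gaussian noise with standard deviation noise_std (same units)."""
    rng = random.Random(seed)   # local generator keeps the result reproducible
    return [x + rng.gauss(0.0, noise_std) for x in normal_window]


# Illustrative use on a flat 400-point window (0.02 s at 20 kHz).
flat_window = [0.0] * 400
synthetic = add_gaussian_noise(flat_window, noise_std=0.05, seed=1)
```

Such synthetic windows would still need to be validated against measured arc signals before being used to enlarge a fault dataset.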
Another possible way is to use transfer learning (e.g. domain adaptation). Transfer 
learning aims to leverage the knowledge of a well-defined domain (e.g. laboratory 
system where sufficient data can be obtained) to enhance the performance of the ML 
models on a target domain task with fewer training samples (e.g. in-service 
system where few labelled fault data can be obtained). Although good results have been 
achieved through transfer learning in other fields of fault diagnosis [161], it has not 
been utilised for DC series arc fault detection in PV systems. 
4.6.2. Inconsistency between Training and Testing Dataset 
In the majority of studies with ML methods, the dataset is divided into a training 
dataset and a testing dataset with certain ratios (e.g. 80% to 20% or 50% to 50%) under 
the following important assumptions: (1) the training dataset and testing dataset have 
the same data distribution; (2) the testing dataset can represent the data encountered in 
field operation. However, this is often not the case, especially for practical applications. 
For example, Xia et al. observed such inconsistency in DC series arc fault detection 
for electric vehicle systems, caused by various factors such as different types of loads 
[110], different voltage/current levels [109], and different arc gap distances [109]. This 
problem could be mitigated by establishing a classification model trained on a complex 
dataset with all required cases. In this case, the testing data is drawn from the same 
distribution as the training data. However, it is extremely time-consuming and even 
unrealistic to gather enough data from a large number of cases. Furthermore, an ML 
classifier trained with a complex dataset generally performs worse than one trained with 
a simple dataset [146]. Therefore, transfer learning may provide a more feasible 
solution to tackle this problem effectively. 
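One way to expose this inconsistency during offline evaluation is to hold out an entire operating condition instead of splitting at random, so that the testing data comes from a condition never seen in training. The sketch below is illustrative only: the condition labels and the sample layout are hypothetical.

```python
def split_by_condition(samples, held_out):
    """Split (condition, window, label) samples so that every sample of
    the held-out condition goes to the test set and none to training."""
    train = [s for s in samples if s[0] != held_out]
    test = [s for s in samples if s[0] == held_out]
    if not train or not test:
        raise ValueError("held-out condition must leave both sets non-empty")
    return train, test


# Hypothetical samples: (operating condition, signal window, label).
samples = [
    ("load_A", [0.1, 0.2], 0),
    ("load_A", [0.9, 0.1], 1),
    ("load_B", [0.2, 0.2], 0),
    ("load_B", [0.8, 0.3], 1),
    ("load_C", [0.1, 0.1], 0),
    ("load_C", [0.7, 0.2], 1),
]
train_set, test_set = split_by_condition(samples, held_out="load_C")
```

A large gap between the accuracy on a random split and on such a condition-wise split would indicate that the classifier does not generalise across conditions.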
This phenomenon has yet to be investigated for DC arc fault detection in PV 
applications, which motivates the research detailed in the following chapter. 
4.6.3. Unlabelled Dataset 
The majority of researchers focus on supervised learning, which requires the 
datasets to be fully annotated. Compared with labelled datasets, which 
require manual annotations from domain experts, unlabelled datasets are much easier to 
obtain [116]. Moreover, transfer learning has demonstrated promising results 
on massive unlabelled datasets in fault diagnostics [161]. Consequently, using 
unlabelled data to develop effective DL methods can be a worthy research area in the 
future. 
4.6.4. ML Model Complexity and Real-time Capability 
The computational complexity of ML algorithms should be considered for 
cost-effective real-time deployment in resource-constrained edge computing devices (e.g. 
distributed on-line monitoring systems or intelligent arc fault detectors). The 
computational complexity and the accuracy of ML models mainly depend on the 
architecture of the model, the sampling rate for data acquisition, the size of the input 
samples, etc. Although DL approaches can outperform conventional ML methods in fault 
diagnostics in many cases, accuracy improvements come at the expense of exponentially 
increasing computation resources as well as energy consumption. Therefore, the 
complexity of the problem as well as the resource constraints of the specific application 
need to be taken into account in the development of ML models, especially DL models. 
As a result, it is recommended to report essential information such as training/testing 
time and sensitivity to hyperparameters of the models, which enables direct comparison 
across different models [164]. On the other hand, research on efficient hardware 
implementation in low-cost devices, such as field programmable gate arrays and 
application-specific integrated circuits, can also pave the way to realise cost-effective 
solutions based on DL methods for on-line intelligent fault diagnostics. 
4.6.5. Model Interpretability 
Even though DL methods can achieve top-rated accuracy compared to conventional 
ML methods, they are less amenable to interpretation because of their deeper structures 
[136]. The lack of core understanding restrains the development of DL on a 
fundamental level. The DL models are selected based on trial and error instead of 
rigorous logical theories. To address this, some prior works in [165], [166] carried out 
comprehensive theoretical analysis about interpreting and understanding DL models. 
These studies shed some light on selecting the optimal sets of hyperparameters for DL 
models, which can help to achieve more reliable and robust DC series arc fault 
detection performance. 
4.7. Conclusion 
This chapter presents a DL-based approach to detect DC series arc faults in PV 
systems, utilising a CNN with a 2D raw data matrix as input. To the best of the author’s 
knowledge, this is the first time DL has been applied to DC series arc fault 
detection. A lightweight CNN structure is designed through hyperparameter tuning to 
facilitate cost-effective real-time implementation in practice. Based on the off-line 
validation results, with sufficient training data, the proposed CNN is capable of 
detecting series arc faults under different experimental conditions with an overall 
accuracy exceeding 99%. A comparative study on different popular ML classifiers is 
carried out using the same dataset, and the CNN demonstrates the best-in-class 
detection accuracy. Furthermore, a higher level of accuracy can be achieved if a more 
complex CNN model is used. 
The proposed algorithm is deployed in a prototype monitoring system to verify the 
real-time operations under different conditions with a PV emulator based single-phase 
grid-tied system. There is no unwanted tripping experienced during the inverter start-up, 
MPPT operation, or current step changes. Furthermore, the proposed algorithm can 
accurately detect both short-term intermittent and sustained series arc faults in a timely 
manner. The sensitivity, effectiveness, and robustness of the proposed algorithm are 
confirmed through both offline and real-time validations. 
Finally, detailed recommendations and potential solutions are provided to address 
several problems that prevent intelligent DC series arc fault detection from being 
applied to real-world engineering applications. 
 
 
Some of the work described in this chapter has been published in: 
1. Shibo Lu, Animesh Sahoo, Rui Ma, and B. T. Phung, “DC Series Arc Fault 
Detection using Machine Learning in Photovoltaic Systems: Recent Developments and 
Challenges,” International Conference on Condition Monitoring (CMD), Phuket, 
Thailand, 25-28 Oct. 2020. 
2. Shibo Lu, Hua Chai, Animesh Sahoo, and B. T. Phung, “Condition Monitoring 
based on Partial Discharge Diagnostics using Machine Learning Methods: A 
Comprehensive State-of-the-Art Review,” accepted for publication in IEEE 
Transactions on Dielectrics and Electrical Insulation, in press, 3 July 2020. 
 
 
 
 
5. Intelligent DC Series Arc Fault Detection in PV Systems 
using DA-DCGAN without Target-Domain Fault Data 
5.1. Introduction 
With recent advances in computing methodologies and information technology, 
data-driven ML-based methods have become increasingly popular and have demonstrated 
promising results in fault diagnosis tasks in many fields, such as high impedance fault 
detection in medium voltage networks [167], failure detection in electrical machines 
[154], and track circuit fault diagnosis in railway systems [168]. A number of recent studies have 
achieved good results for DC series arc fault detection using conventional ML methods, 
as reviewed in Section 2.4.7. In Chapter 4, the effectiveness of various DL algorithms 
was examined, and they demonstrated superior detection accuracy over conventional 
ML methods. Although these ML methods, especially DL methods, can achieve 
excellent accuracy, one of the main issues is domain-switching, which can cause severe 
performance degradation as discussed in Section 4.6.2. For example, some algorithms 
are developed using data collected from power-electronics-based DC power sources (i.e. 
PV emulators). Even though such power sources can reproduce DC output 
characteristics similar to those of real PV systems, the specific features of the signal may vary 
depending on the type of power source as well as the structure of the experimental 
setup [37], [169]. In addition, the electrical arc signals can be masked and 
modified by the parasitic capacitance and inductance contributed by real PV 
modules and circuit cables [131], [169]. Various domain adaptation and transfer 
learning techniques have been developed in other fields to address domain-shifting 
problems, such as handwritten-digit recognition [170] and object recognition [171], 
[172]. Recently, they have been applied to fault diagnosis, such as gear fault detection 
in gearbox systems using relatively small datasets with deep CNN-based transfer 
learning [173], machine fault diagnosis using a deep one-dimensional CNN-based 
transfer learning network with unlabelled data [174], and cross-domain fault diagnosis 
using a sparse auto-encoder and fine-tuning with target-domain data [175]. 
However, most of these investigations assume that target-domain fault data is 
available and sufficient, and this assumption is a major challenge. In practice, obtaining sufficient 
series arc fault data in a real PV system is costly and time-consuming. Furthermore, 
even though supervisory control and data acquisition systems can provide information 
about PV systems, they are of limited use because of their low sampling frequency and 
the intensive effort needed to extract useful arc fault signals from a huge amount of unlabelled 
data. For these reasons, the performance of data-driven ML algorithms often 
degrades dramatically when they are applied to a different domain (i.e. from laboratory 
to field). 
In this chapter, an effective methodology, DA-DCGAN, is proposed to address the 
performance degradation of the DL-based detection algorithm in a different domain. 
Target-domain fault data is artificially generated from normal signals and then 
applied for domain adaptation to achieve reliable and accurate diagnosis of 
cross-domain DC series arc faults. This suits the practical situations in 
some real industries and applications, where only the source-domain data (both normal 
and fault data) and the target-domain normal data are available for detection algorithm 
development. It could significantly reduce the effort (i.e. collecting arc fault signals) 
during the algorithm development stage. Furthermore, a lightweight CNN classifier is 
employed for cross-domain fault diagnosis instead of a very deep neural network 
as in other papers, which could be an appropriate solution to enable real-time DC series 
arc fault diagnosis in different practical environments. 
 
The proposed methodology is formulated under the following conditions: 
1. Sufficient normal data and arcing data can be obtained in the source domain 
(laboratory setup based on PV emulator), while only sufficient normal data can be 
obtained in the target domain for training (real grid-connected PV systems). Target-
domain arcing data is collected for testing only. 
2. The source-domain data and target-domain data have different distributions 
because the system parameters, characteristics, and operating conditions are 
different. 
5.2. Experimental Setup 
5.2.1. Experimental Setup and Conditions in Source Domain 
For the preparation of the source-domain data, the same experimental setup as in 
Section 4.4 is used: a 1.5-kW emulator-based grid-connected PV system. The same 
dataset, which has been presented in Section 4.5.1, is used as the source-domain dataset 
for the DA-DCGAN case study. It consists of 20,000 normal samples and 20,000 arcing 
samples, 40,000 in total. Each sample consists of 400 data points (corresponding 
to 0.02 s). 
5.2.2. Experimental Setup and Conditions in Target Domain 
For target-domain data collection and real-time testing, the PV emulator is replaced 
by a rooftop PV string consisting of four JINKO JKM350M-72 mono-crystalline PV 
panels. The key specifications of the PV panels can be found in Table 5.1. Additionally, more 
than 82 m of 6 mm² regular solar cable is used to connect the 
rooftop to the PV connection box. The overall experimental setup and schematic 
representation are illustrated in Figure 5.1. Series arcing experiments are performed at 
different times on a sunny day from 11 am to 5 pm to obtain target-domain arcing and 
normal samples at different current levels. It should be emphasised that series arc faults 
are generated at the start, middle, and end of the PV string. These fault locations are 
recommended for series arc fault detection tests by UL-1699B [3]. Ultimately, 25,000 
normal samples and 5,000 arcing samples are extracted to form the target-domain 
dataset, where 20% of the normal samples, chosen at random, are reserved for offline testing 
and the rest are used for training and for generating dummy target-domain arcing 
samples. 
 
 
Figure 5.1 Experimental setup and schematic representation for target-domain data 
collection and real-time validation tests 
 
Table 5.1 Specification of PV model JINKO (JKM350M-72) at STC 
Rated Power | I_MPP | V_MPP | I_SC | V_OC 
350 W | 8.94 A | 39.10 V | 9.38 A | 47.5 V 
 
Note that every time-sequence sample in both the source domain and the target domain is 
pre-processed using [-1, 1] min-max normalisation and arranged into an image-based 
signal of size 20×20 before being fed into any neural network in DA-DCGAN. 
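The pre-processing step described above can be illustrated with a minimal sketch. It assumes that the normalisation is applied per sample and that the 400 points are arranged row by row; neither detail is stated in the text, and the function names and the ramp-shaped test window are hypothetical.

```python
def minmax_to_unit_range(window):
    """Scale one window linearly so that its minimum maps to -1 and its maximum to +1."""
    lo, hi = min(window), max(window)
    if hi == lo:
        # Flat window: return zeros to avoid dividing by zero (assumed handling).
        return [0.0] * len(window)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in window]


def to_square_image(window, side=20):
    """Arrange a 1-D window of side*side points row by row into a 2-D grid."""
    if len(window) != side * side:
        raise ValueError("expected %d points, got %d" % (side * side, len(window)))
    return [window[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical 400-point current window (20 ms at 20 kHz): a simple ramp around 7 A.
raw = [7.0 + 0.001 * i for i in range(400)]
image = to_square_image(minmax_to_unit_range(raw))

print(len(image), len(image[0]))    # 20 20
print(image[0][0], image[-1][-1])   # -1.0 1.0
```

The [-1, 1] range matches the Tanh output of the generator listed in Table 5.2, so real and generated samples share the same value range.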
5.3. Proposed DA-DCGAN 
5.3.1. Generative Adversarial Networks 
The generative adversarial network (GAN) was initially proposed by I. Goodfellow et 
al. in 2014 [176] and provides an alternative to maximum likelihood 
estimation. The general framework of a GAN is illustrated in Figure 5.2. 
A GAN is a deep neural network comprising two parts: a generator, G, and a discriminator, D. 
G generates synthetic samples and tries to fool D, while D tries to distinguish these 
generated samples from the real ones. Therefore, G and D compete in a two-player 
minimax game as expressed in (5.1): 
$$\min_{G} \max_{D} L_{GAN}(D, G) = \mathbb{E}_{x \sim P_{real}(x)}[\log D(x)] + \mathbb{E}_{z \sim P_{Z}(z)}[\log(1 - D(G(z)))] \quad (5.1)$$
where $\mathbb{E}(\cdot)$ denotes the expectation of its argument; $x$ and $z$ are sampled from the 
real data distribution $P_{real}(x)$ and the noise distribution $P_{Z}(z)$, respectively; and $D(x)$ is the 
output of the discriminator, i.e. the probability that $x$ came from the real data 
rather than from the generated data. 
For a perfect D, $D(x)$ should be 1 when a real sample is input and 
0 when a fake sample is input. However, original GANs can sometimes generate noisy 
and incomprehensible fake samples, and they are known to be difficult to 
train. Therefore, to enhance the performance and stabilise the training process of GANs, 
a class of CNNs called DCGAN was introduced to mitigate these problems [177]. 
[Figure 5.2 diagram labels: latent space; generator (G); real samples; fake samples; discriminator (D); real or fake?; back-propagation]
 
Figure 5.2 General framework of a generative adversarial network 
5.3.2. Optimisation Procedures and Deep Learning Model Structures 
The process of the proposed DA-DCGAN methodology is visualised in Figure 5.3, 
and it comprises two stages. 
 
Figure 5.3 Overview of DA-DCGAN for DC series arc fault diagnosis in PV systems 
 
Let $x^{s,a}$ and $x^{s,n}$ be samples of the arcing and normal categories from the source-domain 
data distribution $P_s$; let $x^{t,a}$ and $x^{t,n}$ be samples of the arcing and normal categories 
from the target-domain data distribution $P_t$; and let $y^a = 1$ and $y^n = 0$ be the labels for arcing 
and normal data, respectively, in both domains. 
At the first stage, the DCGAN is trained through an adversarial training process, 
where D is trained to distinguish real arcing data $x^{s,a}$ from artificially generated arcing 
data $G(x^{s,n})$, and G is trained to transform normal data $x^{s,n}$ into high-quality fake 
arcing data $G(x^{s,n})$ to fool D. Therefore, the optimisation objectives can be defined as 
follows: the network D tries to minimise the classification loss $L_D$ between fake arcing 
$G(x^{s,n})$ and real arcing $x^{s,a}$ using the source-domain dataset: 
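$$L_D = -\mathbb{E}_{x^{s,a} \sim P_s(x^{s,a})}[\log D(x^{s,a})] - \mathbb{E}_{x^{s,n} \sim P_s(x^{s,n})}[\log(1 - D(G(x^{s,n})))] \quad (5.2)$$
while the transformative network G tries to minimise the following loss $L_G$: 
$$L_G = \mathbb{E}_{x^{s,n} \sim P_s(x^{s,n})}[\log(1 - D(G(x^{s,n})))] \quad (5.3)$$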
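The behaviour of the two first-stage losses can be checked numerically. The sketch below is illustrative only: it evaluates batch estimates of (5.2) and (5.3) from hypothetical discriminator outputs and does not train any network. The function names and the probability values are assumptions made for the example.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Batch estimate of (5.2).

    d_real: D outputs on real arcing samples; d_fake: D outputs on G(normal samples).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -real_term - fake_term


def generator_loss(d_fake):
    """Batch estimate of (5.3); it decreases as D(G(x)) moves towards 1."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs (probabilities of "real arcing").
confident = discriminator_loss([0.90, 0.95], [0.10, 0.05])
undecided = discriminator_loss([0.50, 0.50], [0.50, 0.50])

print(confident < undecided)                             # True
print(abs(undecided - 2.0 * math.log(2.0)) < 1e-12)      # True
print(generator_loss([0.9, 0.8]) < generator_loss([0.1, 0.2]))  # True
```

A discriminator that cannot tell real from generated arcing (all outputs 0.5) gives $L_D = 2\log 2$, while a generator that fools D more often obtains a lower $L_G$.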
In the second stage, a lightweight CNN with the optimal structure designed 
in Chapter 4 is trained to minimise the classification error using the labelled source-domain 
dataset $\mathcal{D}_s(x^s, y) = \{\{(x_i^{s,a}, y_i^a)\}_{i=1}^{N_{s,a}}, \{(x_i^{s,n}, y_i^n)\}_{i=1}^{N_{s,n}}\}$ of $N_{s,a}$ arcing samples and $N_{s,n}$ 
normal samples, and the target-domain dataset $\mathcal{D}_t(x^t, y) = \{\{(G(x_i^{t,n}), y_i^a)\}_{i=1}^{N_{t,n}}, \{(x_i^{t,n}, y_i^n)\}_{i=1}^{N_{t,n}}\}$ 
of $N_{t,n}$ normal samples and $N_{t,n}$ dummy arcing 
samples. The binary cross-entropy loss $L_{BCE}$ based on a batch of data with 
size $m$ can be defined as follows: 
$$L_{BCE} = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log \frac{1}{1 + e^{-((w^{lf})^T f(x_i) + b^{lf})}} + (1 - y_i) \log\left(1 - \frac{1}{1 + e^{-((w^{lf})^T f(x_i) + b^{lf})}}\right) \right] \quad (5.4)$$
where $y_i$ is the ground-truth category label of the $i$th sample (1 for arcing and 0 for 
normal); $f(\cdot)$ is the high-level feature in the lightweight CNN classifier before the classification 
layer; and $w^{lf}$ and $b^{lf}$ denote the weight coefficient matrix and bias coefficient of the last 
fully-connected layer. The convolution layer, pooling layer, and fully-connected 
layers in the lightweight CNN classifier work together as a high-level feature extractor 
that is responsible for extracting useful features from the input signal. Maximum mean 
discrepancy (MMD) is employed to assist the optimisation process in the lightweight 
CNN classifier, enabling domain-invariant feature learning in order to achieve inter-domain 
condition classification. MMD is a metric that estimates the distance between the 
distributions from which two sets of data are drawn [178]. The squared 
MMD loss in DA-DCGAN can be expressed as follows: 
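$$L_{MMD} = \left\| \frac{1}{N_{s,a}} \sum_{i=1}^{N_{s,a}} f(x_i^{s,a}) - \frac{1}{N_{t,a}} \sum_{j=1}^{N_{t,a}} f(x_j^{t,a}) \right\|_{\mathcal{H}}^{2} + \left\| \frac{1}{N_{s,n}} \sum_{i=1}^{N_{s,n}} f(x_i^{s,n}) - \frac{1}{N_{t,n}} \sum_{j=1}^{N_{t,n}} f(x_j^{t,n}) \right\|_{\mathcal{H}}^{2} \quad (5.5)$$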
where $\|\cdot\|_{\mathcal{H}}$ denotes the norm in a reproducing kernel Hilbert space $\mathcal{H}$. Reducing $L_{MMD}$ brings the high-level 
features from the different domains closer together in $\mathcal{H}$. Therefore, combining both 
optimisation objectives of the second stage, the overall optimisation objective for the 
lightweight CNN classifier is to minimise (5.6): 
$$L_C = L_{BCE} + \lambda L_{MMD} \quad (5.6)$$
where $\lambda$ is the punishment coefficient for the $L_{MMD}$ term, which controls how strong the domain 
adaptation is. The detailed architectures and parameters of the different neural networks in 
DA-DCGAN are listed in Table 5.2. In the table, Conv. denotes the two-dimensional 
convolution layer, and Conv. (UpSam) denotes the up-sampling two-dimensional 
convolution layer. For the leaky ReLU activation function used in DA-DCGAN, 
parameter α is set to 0.2 based on the recommendations in [177]. 
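To make the combined objective (5.6) concrete, the following sketch evaluates it on toy values. It assumes a linear kernel, so that the norm in (5.5) reduces to the squared Euclidean distance between class-wise feature means; the text allows any reproducing kernel Hilbert space, and the kernel actually used is not specified here. The feature vectors, predicted probabilities, and the value of the coefficient `lam` are hypothetical.

```python
import math


def mean_vector(features):
    """Component-wise mean of a list of equal-length feature vectors."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def squared_mean_distance(a, b):
    """Squared Euclidean distance between the mean features of two sample sets."""
    return sum((x - y) ** 2 for x, y in zip(mean_vector(a), mean_vector(b)))


def mmd_loss(src_arc, tgt_arc, src_norm, tgt_norm):
    """Class-wise squared MMD as in (5.5), under the linear-kernel assumption."""
    return (squared_mean_distance(src_arc, tgt_arc)
            + squared_mean_distance(src_norm, tgt_norm))


def bce_loss(labels, probs):
    """Binary cross-entropy as in (5.4); probs are the sigmoid outputs."""
    total = sum(y * math.log(p) + (1 - y) * math.log(1.0 - p)
                for y, p in zip(labels, probs))
    return -total / len(labels)


# Hypothetical 2-D high-level features f(x) for each domain and class.
src_arc, tgt_arc = [[1.0, 0.0], [3.0, 0.0]], [[2.0, 1.0], [2.0, 1.0]]
src_norm, tgt_norm = [[0.0, 0.0], [0.0, 2.0]], [[0.0, 1.0]]

lam = 0.5  # hypothetical punishment coefficient
mmd = mmd_loss(src_arc, tgt_arc, src_norm, tgt_norm)
total = bce_loss([1, 0], [0.5, 0.5]) + lam * mmd

print(mmd)    # 1.0
print(total)  # log(2) + 0.5
```

In this toy case the normal-class means coincide across domains, so only the arcing term contributes to the MMD loss.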
 
 
Table 5.2 The architecture of different neural networks in DA-DCGAN 
Network | No. | Layer type | Kernel size | No. of kernels | Stride | Padding | BN | Activation | Dropout | Output
G | 1 | Dense | 3200 | 1 | - | - | Yes | ReLU | No | 3200
G | 2 | Conv. (UpSam) | 3×3 | 128 | 2×2 | Yes | Yes | ReLU | No | 10×10×128
G | 3 | Conv. (UpSam) | 3×3 | 64 | 2×2 | Yes | Yes | ReLU | No | 20×20×64
G | 4 | Conv. | 5×5 | 1 | 1 | Yes | No | Tanh | No | 20×20×1
D | 1 | Conv. | 5×5 | 32 | 1 | No | No | Leaky ReLU | 0.6 | 16×16×32
D | 2 | Conv. | 3×3 | 64 | 1 | No | Yes | Leaky ReLU | 0.6 | 14×14×64
D | 3 | Conv. | 3×3 | 128 | 1 | No | Yes | Leaky ReLU | 0.6 | 12×12×128
D | 4 | Conv. | 3×3 | 256 | 1 | No | Yes | Leaky ReLU | 0.6 | 10×10×256
D | 5 | Dense | 1 | 1 | - | - | No | Sigmoid | No | 1
CNN | 1 | Conv. | 5×5 | 3 | 1 | No | Yes | ReLU | No | 16×16×3
CNN | 2 | Maxpooling | 2×2 | 1 | 2×2 | No | No | - | No | 8×8×3
CNN | 3 | Dense | 8 | 1 | - | - | Yes | ReLU | No | 8
CNN | 4 | Dense | 5 | 1 | - | - | Yes | ReLU | No | 5
CNN | 5 | Dense | 1 | 1 | - | - | No | Sigmoid | No | 1
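The output sizes in Table 5.2 follow from the usual rule for unpadded convolution and pooling, and the same rule gives a rough idea of how small the lightweight CNN is. The sketch below is a back-of-the-envelope check rather than the thesis implementation: the helper names are hypothetical, a bias term is counted for every layer, and batch-normalisation parameters are ignored, both of which are assumptions.

```python
def valid_output_side(side, kernel, stride=1):
    """Output width of an unpadded convolution or pooling layer."""
    return (side - kernel) // stride + 1


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel (bias assumed present)."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out


# Lightweight CNN: 20x20x1 input -> Conv 5x5 (3 kernels) -> MaxPool 2x2 -> Dense 8, 5, 1.
conv_side = valid_output_side(20, 5)               # 16
pool_side = valid_output_side(conv_side, 2, 2)     # 8
flat = pool_side * pool_side * 3                   # 192 features after flattening

layer_params = [conv_params(5, 1, 3), dense_params(flat, 8),
                dense_params(8, 5), dense_params(5, 1)]

# Discriminator: four unpadded convolutions with kernels 5, 3, 3, 3.
d_sides, side = [], 20
for kernel in (5, 3, 3, 3):
    side = valid_output_side(side, kernel)
    d_sides.append(side)

print(conv_side, pool_side, flat)        # 16 8 192
print(layer_params, sum(layer_params))   # [78, 1544, 45, 6] 1673
print(d_sides)                           # [16, 14, 12, 10]
```

Under these assumptions the classifier has fewer than two thousand trainable weights, which is consistent with the intention to run it on an embedded controller.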
 
 
 
 
 
Furthermore, the overall optimisation procedure and the values of the 
key parameters of DA-DCGAN are presented in Algorithm 5.1. 
Algorithm 5.1: Proposed DA-DCGAN 
Parameters: $N_{epoch,DCGAN} = 8000$, $N_k = 1$, $m = 32$, $lr = 0.0001$, $\beta_1 = 0.5$, $\beta_2 = 0.9$, 
$N_{epoch,C} = \frac{4 N_{s,n}}{2k} \times 20$, $N_{s,n} = N_{s,a} = N_{t,n} = 20000$, $k = 32$ 
Step 1: DCGAN training using source-domain data; 
for num_epoch = 0, ..., $N_{epoch,DCGAN}$ do 
    for num_train = 1, ..., $N_k$ do 
        Sample a batch of arcing data $\{x_i^{s,a}\}_{i=1}^{m}$ and a batch of normal data $\{x_i^{s,n}\}_{i=1}^{m}$ 
        for i = 1, ..., m do 
            $L_{D,i} \leftarrow -\log D(x_i^{s,a}) - \log(1 - D(G(x_i^{s,n})))$ 
        end for 
        Update D by descending its gradient using Adam: 
        $\theta_D \leftarrow \mathrm{Adam}(\theta_D, \nabla_{\theta_D} \frac{1}{m} \sum_{i=1}^{m} L_{D,i}, lr, \beta_1, \beta_2)$ 
    end for 
    Sample a batch of normal data $\{x_i^{s,n}\}_{i=1}^{m}$ 
    for i = 1, ..., m do 
        $L_{G,i} \leftarrow \log(1 - D(G(x_i^{s,n})))$ 
    end for 
    Update G by descending its gradient using Adam: 
    $\theta_G \leftarrow \mathrm{Adam}(\theta_G, \nabla_{\theta_G} \frac{1}{m} \sum_{i=1}^{m} L_{G,i}, lr, \beta_1, \beta_2)$ 
end for 
for i = 0, ..., $N_{t,n}$ do 
    $x_i^{t,a} = G(x_i^{t,n})$ 
end for 
Step 2: Training the lightweight CNN classifier using source-domain data and target-domain 
normal data; 
for num_epoch = 0, ..., $N_{epoch,C}$ do 
    Sample $\{(x_j^{s}, y_j)\}_{j=1}^{2k} = \{\{(x_i^{s,a}, y_i^a)\}_{i=1}^{k}; \{(x_i^{s,n}, y_i^n)\}_{i=k+1}^{2k}\}$ from $\mathcal{D}_s(x^s, y)$; 
    Sample $\{(x_j^{t}, y_j)\}_{j=1}^{2k} = \{\{(x_i^{t,a}, y_i^a)\}_{i=2k+1}^{3k}; \{(x_i^{t,n}, y_i^n)\}_{i=3k+1}^{4k}\}$ from $\mathcal{D}_t(x^t, y)$; 
    for i = 1, ..., 4k do 
        Calculate $L_{BCE,i}$ using (5.4) 
    end for 
    for j = 1, ..., 2k do 
        Calculate $L_{MMD,j}(x_j^{s}, x_j^{t})$ using (5.5) 
    end for 
    Update C by descending its gradient using Adam: 
    $\theta_C \leftarrow \mathrm{Adam}(\theta_C, \nabla_{\theta_C} (\frac{1}{4k} \sum_{i=1}^{4k} L_{BCE,i} + \frac{\lambda}{k} \sum_{j=1}^{2k} L_{MMD,j}), lr, \beta_1, \beta_2)$ 
end for 
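The parameter values in Algorithm 5.1 can be turned into concrete numbers with a few lines of arithmetic. The sketch below only evaluates the stated expressions; the variable names are hypothetical, and reading the 20% hold-out from Section 5.2.2 as the origin of the 20,000 target-domain normal training samples is an interpretation that happens to match the stated value.

```python
# Values stated in Algorithm 5.1.
n_per_class = 20000   # N_{s,n} = N_{s,a} = N_{t,n}
k = 32                # samples drawn per class and per domain in each classifier update

# N_{epoch,C} = (4 * N_{s,n}) / (2k) * 20
classifier_updates = (4 * n_per_class) // (2 * k) * 20

# Each classifier update uses k samples from each of the four groups:
# source arcing, source normal, target dummy arcing, target normal.
samples_per_update = 4 * k

# Section 5.2.2: 25,000 target-domain normal samples, 20% held out for offline testing.
target_normal_total = 25000
target_normal_test = target_normal_total * 20 // 100
target_normal_train = target_normal_total - target_normal_test

print(classifier_updates)                        # 25000
print(samples_per_update)                        # 128
print(target_normal_train, target_normal_test)   # 20000 5000
```

The 20,000 remaining target-domain normal samples are also the inputs from which the generator produces the 20,000 dummy arcing samples used in Step 2.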
 
5.4. Case Study 1: Offline Validation Results 
To demonstrate the superior performance of the proposed DA-DCGAN 
methodology, several methods are evaluated for comparison as follows: 
1) Only the lightweight CNN with the same structure trained using source-domain 
data; 
2) Only the lightweight CNN with the same structure trained using source-domain 
data and target-domain normal data; 
3) Transfer component analysis with an SVM classifier trained using source-domain 
data and target-domain normal data [179]. The input data is flattened; 
4) Deep neural network for domain adaptation in fault diagnosis trained using 
source-domain data and target-domain normal data. SVM is employed as the 
classifier as suggested in [180]. The input data is flattened; 
5) Proposed DA-DCGAN without MMD punishment term trained using the 
source-domain data, normal data and dummy arcing data from the target-
domain. 
The results of the different methods on the target-domain testing dataset are presented 
in Table 5.3. DA-DCGAN demonstrates excellent performance compared to the other 
methods. The training detection accuracy of DA-DCGAN for arcing and normal samples is 
98.80% and 99.56%, respectively. On the testing dataset, DA-DCGAN dramatically 
improves the arcing recognition accuracy (97.68%) while the classification accuracy of the 
normal state remains approximately unchanged (99.32%) in the target-domain series arc 
fault detection task as compared to methods 1)-5). Besides the classification accuracy 
of the arcing and normal categories, two other important metrics, named sensibility and 
safety, are used to evaluate the performance of the different methods. Sensibility 
measures how well the system avoids false alarms under normal operating conditions and 
normal transient events. Safety measures how well the system avoids missed alarms 
when a DC series arc fault occurs. These two 
metrics can be calculated through (4.24) and (4.25) as presented in Section 4.5.2. 
 
Table 5.3 Testing accuracy comparison for target domain series arc fault detection 
Method | Arcing | Normal | Sensibility | Safety | Overall accuracy | Computation complexity
1) | 99.98% | 18.42% | 55.07% | 99.89% | 59.20% | O(N^2)
2) | 76.56% | 99.48% | 99.33% | 80.93% | 88.02% | O(N^2)
3) | 70.32% | 98.92% | 98.49% | 76.92% | 84.62% | O(N^2) / O(N^3)
4) | 80.36% | 99.22% | 99.04% | 83.48% | 89.79% | O(N^2) / O(N^3)
5) | 92.12% | 99.40% | 99.35% | 92.65% | 95.76% | O(N^2)
Proposed | 97.68% | 99.32% | 99.31% | 97.72% | 98.50% | O(N^2)
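The sensibility and safety columns can be cross-checked against the per-class accuracies. Equations (4.24) and (4.25) are defined outside this chapter, so the formulas in the sketch below are assumptions: sensibility is taken as TP / (TP + FP) and safety as TN / (TN + FN), with arcing as the positive class. With a testing set of 5,000 arcing and 5,000 normal samples (the 20% hold-out described in Section 5.2.2), these assumed definitions reproduce the tabulated values for the two rows checked here.

```python
def detection_metrics(tp, fn, tn, fp):
    """Return (sensibility, safety, overall accuracy) in percent, rounded to 2 decimals.

    Assumed definitions (arcing = positive class):
      sensibility = TP / (TP + FP)   share of alarms that are genuine
      safety      = TN / (TN + FN)   share of "normal" decisions that are genuine
    """
    sensibility = 100.0 * tp / (tp + fp)
    safety = 100.0 * tn / (tn + fn)
    overall = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return round(sensibility, 2), round(safety, 2), round(overall, 2)


# Counts derived from Table 5.3 with 5,000 arcing and 5,000 normal test samples.
# Method 1): 99.98% arcing, 18.42% normal accuracy.
method_1 = detection_metrics(tp=4999, fn=1, tn=921, fp=4079)
# Proposed: 97.68% arcing, 99.32% normal accuracy.
proposed = detection_metrics(tp=4884, fn=116, tn=4966, fp=34)

print(method_1)   # (55.07, 99.89, 59.2)
print(proposed)   # (99.31, 97.72, 98.5)
```

Read this way, the low sensibility of method 1) reflects that almost half of its alarms are raised on normal samples.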
 
 
The interactions between the solar inverter and the PV emulator introduce dramatic 
variation into the current signal (the DC-side current is around 7 A for both conditions), as 
shown in Figure 5.4. Therefore, the fluctuation pattern and the magnitude of the signals 
captured from the laboratory and real PV systems differ significantly from each 
other. This degrades the performance of the detection algorithm if the classifier is 
trained solely on the source-domain data. As shown for method 1) in Table 5.3, the 
testing accuracy of the normal state and the sensibility on the target-domain dataset are only 
18.42% and 55.07%, respectively, which could cause frequent false tripping and is thus 
not viable for real-world deployment. Next, the target-domain normal data is included, 
as described in method 2). The testing accuracy of the normal state and the sensibility increase 
at the expense of a decrease in the testing accuracy of the arcing state and the safety. The overall 
performance is still not sufficient to be considered a reliable scheme. 
 
[Figure 5.4 annotations: DC current level = 7 A; zoomed CT signal]

Figure 5.4 Healthy signal captured by CT from source domain and target domain 
To visualise the impact of the MMD term on the distribution discrepancy of high-level 
features from different domains, the t-distributed stochastic neighbour 
embedding (t-SNE) algorithm is adopted to project the high-level features into a 2D 
space [181], [182]. For this purpose, 700 samples are randomly chosen from 
each category (3500 samples in total). As shown in Figure 5.5 (a), when the DA-DCGAN 
is used without MMD, features of different categories overlap in the middle of the plot. 
This results in lower accuracy since it is not 
possible to derive a clear separation line between normal and arcing for classification. 
On the other hand, in Figure 5.5 (b), with the MMD punishment term, the distribution 
discrepancy between the target-domain and source-domain features is minimised, which 
helps the lightweight CNN to learn more domain-invariant (shared) features and hence 
to classify with higher accuracy. As a result, the target-domain arcing 
classification accuracy improves significantly from 92.12% to 97.68%. 
 
Figure 5.5 Visualisation of high-level features in the lightweight CNN before 
classification layer using t-SNE method: (a) DA-DCGAN without MMD; (b) proposed 
DA-DCGAN. 
The computation complexity of the different methods is compared 
in Table 5.3. SVM computation complexity can be O(N^2) or O(N^3) depending on the 
kernel size [183]. Here, the computation complexity is O(N^2) for the SVM in 
method 3) and method 4). It is worthwhile to point out that very deep networks such as 
GANs are only used offline to generate fake arcing signals, while a lightweight CNN, 
which has a computation complexity of O(N^2) [184], is used for the online embedded 
application. On the other hand, the SVM works properly with a feature extractor, as 
shown in method 4) [180]. The feature extractor used in this case study for method 4) is 
an SAE consisting of three fully connected layers. Therefore, the resultant computation 
effort is similar to that of the proposed approach. However, the proposed approach 
achieves better outcomes. 
 
The proposed method can achieve higher accuracy by varying the structure of the 
lightweight CNN. As shown in Figure 5.6, the overall accuracy improves from 98.5% 
to 99.4% when the number of kernels in the convolution layer of the lightweight 
CNN is increased from 3 to 8. This is often the case for CNN-based classifiers because the kernels in 
the convolution layer are responsible for extracting discriminative features for 
classification. Thus, a greater number of kernels increases the prospect of 
finding the optimal feature set, but at the expense of a higher computation load. 
[Figure 5.6 plot labels: horizontal axis "Number of kernels (number of filters)" with values 2, 3, 5, 8; legend entry "Sensibility"]
 
Figure 5.6 Impact of kernel numbers in the convolution layer of the lightweight CNN 
on the performance of target-domain DC series arc fault diagnosis 
The effect of the sampling frequency is also investigated. The original dataset is 
downsampled to 5 kHz and 40 kHz, respectively, and fed into the DA-DCGAN 
following the same procedures. For the 5 kHz dataset (the size of each input sample is 
10 × 10), the accuracy for arcing and normal samples reduces to 86.20% and 96.08%, 
respectively. This is expected because a 5 kHz signal carries less information 
than a 20 kHz signal. Furthermore, the detection accuracy 
for arcing and normal samples increases slightly to 98.12% and 99.62% for the 40 kHz scenario, as 
expected (the size of each input sample is 28 × 28). However, there is no 
significant improvement compared to the 20 kHz scenario. Therefore, a 20 kHz 
sampling frequency (the size of each input sample is 20 × 20) is a relatively good 
choice to demonstrate the proposed methodology, and it can be modified according to 
the implementation requirements such as the capability of the microcontroller used. 
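The input sizes quoted for the three sampling frequencies follow from the 0.02 s window length. The short sketch below reproduces them; it assumes that the window is cropped to the largest square that fits, which is consistent with the quoted 28 × 28 size at 40 kHz but is not stated explicitly in the text. The function name is hypothetical.

```python
import math

WINDOW_MS = 20  # each sample spans 0.02 s, as stated in Section 5.2.1


def sample_geometry(fs_hz):
    """Return (points per window, side of the square image, points left over)."""
    points = fs_hz * WINDOW_MS // 1000
    side = math.isqrt(points)          # largest integer side with side*side <= points
    return points, side, points - side * side


for fs in (5000, 20000, 40000):
    print(fs, sample_geometry(fs))
# 5000 (100, 10, 0)
# 20000 (400, 20, 0)
# 40000 (800, 28, 16)
```

At 40 kHz a window holds 800 points, of which a 28 × 28 image uses 784; how the remaining 16 points are handled is an open detail here.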
5.5. Case Study 2: Real-time Implementation and Validation Results 
As in the real-time tests described in Section 4.5.4, a final decision operator is 
applied in the real-time implementation to strike a balance between 
reliability, accuracy, and response speed. The flow chart of the overall real-time series 
arc fault detection is illustrated in Figure 4.14. The settings m=3 and k=10 are used, and no 
wrong decisions are observed in tests using pre-recorded time-series data 
without shuffling. 
The lightweight CNN classifier is then implemented in a prototype based on an NI-
CompactRIO-9030 embedded controller and validated under different experimental 
conditions in real-time. Some examples of the results are presented in the rest of this 
section. 
The oscilloscope display shows four different signals: 
• Voltage at the DC side of PV inverter – the PV voltage captured by the voltage 
probe with ratio of 200:1 (yellow top trace, CH1); 
• Output digital signal of the final decision from the monitoring unit (green bottom 
trace, CH2); 
• Arc voltage signal captured by the voltage probe with ratio of 200:1 (blue trace, 
CH3); 
• Loop current (arc current) captured by the current probe with ratio of 10:1 (pink 
trace, CH4). 
In Figure 5.7, two large current spikes can be seen before the inverter exports 
power to the main grid; they are caused by the DC disconnect closing and the initialisation operation. 
After approximately 40 seconds, the inverter starts to deliver power and perform MPPT. 
No unwanted tripping is experienced during the inverter start-up period. In Figure 5.8, 
a series arc fault (about 220 W) is initiated at the start of the PV string shortly after a 
fast-moving cloud. The fast-moving cloud introduces a current drop of about 50% in 
approximately 5 s. In Figure 5.9, a sudden large current drop is observed, caused by a 
fast shading disturbance during the experiment (possibly a flying bird). The 
proposed algorithm can handle these step changes induced by changes of irradiance and 
detects series arcing in 48 ms, which leaves a significant margin compared to the 
required response time, $T_R$ (with an absolute limit of 2.5 s), as calculated in (2.1) and listed 
in the UL-1699B standard. 
In Figure 5.10, series arc fault detection (about 80 W) at a low irradiance level is 
performed, and the proposed algorithm responds to the event in about 60 ms. A scenario with multiple 
intermittent series arc faults (170-200 W) is also presented 
in Figure 5.11, where the response time fluctuates around 60 ms. Figure 5.12 shows an 
experimental result in which the proposed algorithm accurately detects a series arc 
fault (210 W) generated at the middle of the PV string within 60 ms. 
Real-time arc fault detection tests are repeated three times under the applicable cases 
used in UL-1699B [15]. The separation rate and distance of the arc generator are set to: (i) 
2.5 mm/s and 0.8 mm for arcing tests at a low irradiance level (i.e. below intermediate 
current levels); (ii) 5 mm/s and 0.8 mm & 2.5 mm for arcing tests at a high irradiance level 
(e.g. near the maximum allowable DC current). The arc power ranges from about 80 W 
to 350 W. The proposed diagnosis scheme makes accurate decisions in all of these 
experimental conditions and detects the series arc event in around 60 ms, 
meeting the required detection time with a significant margin. In addition, unwanted 
tripping tests are carried out under three different loading conditions: single-phase 
inverter, DC switch operation, and irradiance step changes. No false tripping is 
encountered throughout the experiment. 
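The size of the margin can be estimated with a small calculation. Equation (2.1) is defined outside this chapter, so the sketch below rests on an assumption: the required response time is taken as the time for the arc to release 750 J, capped at the 2.5 s absolute limit mentioned above. The 750 J figure and the function name are not taken from this chapter and should be checked against (2.1) before the numbers are relied upon.

```python
def required_response_time(arc_power_w, energy_limit_j=750.0, absolute_limit_s=2.5):
    """Assumed form of the required response time T_R: energy limit over arc power,
    capped at the absolute limit. The 750 J default is an assumption."""
    return min(energy_limit_j / arc_power_w, absolute_limit_s)


detection_time_s = 0.060  # typical detection time reported in this section

for arc_power_w in (80, 220, 350):  # arc powers reported in the tests
    t_r = required_response_time(arc_power_w)
    arc_energy_j = arc_power_w * detection_time_s
    print(arc_power_w, round(t_r, 3), round(arc_energy_j, 1))
# 80 2.5 4.8
# 220 2.5 13.2
# 350 2.143 21.0
```

Under this assumption, even the strongest arc tested (about 350 W) would be detected after releasing roughly 21 J, far below the assumed energy limit.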
 
[Figure 5.7 trace annotations: close DC disconnect; inrush current during the initialisation stage; MPPT; no response]
 
Figure 5.7 Response to DC disconnect switch closing, inrush current during 
initialisation of inverter, start-up, and MPPT operation 
 
[Figure 5.8 trace annotations: fast moving cloud, no response; series arc fault, detected rapidly; panels (a) and (b)]
Figure 5.8 Response to fast moving cloud and a series arc fault at high irradiance level 
(10A, full loading): (a) 5s per division; (b) 200ms per division (zoomed) 
 
[Figure 5.9 trace annotations: fast disturbance; no response]
 
Figure 5.9 Response to a fast shading disturbance 
[Figure 5.10 trace annotations: series arc fault at low irradiance level; detected rapidly]

Figure 5.10 Response to a series arc fault at low irradiance level on a cloudy day 
 
[Figure 5.11 trace annotations: several intermittent series arc faults; sustained series arc fault; arc extinguish; detected rapidly; panels (a) and (b)]
Figure 5.11 Response to several intermittent series arc faults followed by a sustained 
arc fault: (a) 2s per division; (b) 200ms per division (zoomed) 
 
[Figure 5.12 trace annotations: series arc fault at the middle of array; detected rapidly; arc extinguish; panels (a) and (b)]
Figure 5.12 Response to a series arc fault generated at the middle of the PV string on a 
sunny day: (a) 1s per division; (b) 100ms per division (zoomed) 
 
 
 
5.6. Conclusion 
This chapter presents a DL-based methodology, DA-DCGAN, for practical 
domain-shifting series arc fault detection in PV systems without using target-domain 
fault data during training. It provides an effective solution to the challenges 
of applying DL to practical series arc fault detection, as discussed in Section 4.6.1, 
Section 4.6.2, and Section 4.6.4. 
Tests on a pre-recorded PV loop current dataset and real-time experiments are carried 
out to validate the effectiveness and robustness of the proposed methodology. Without 
relying on fault data from real PV systems, which is aligned with practical situations, 
the proposed method is able to achieve high detection accuracy without performance 
degradation from domain switching. 
 
Some of the work described in this chapter has been published in: 
Shibo Lu, Tharmakulasingam Sirojan, B. T. Phung, Daming Zhang, and Eliathamby 
Ambikairajah, “DA-DCGAN: An Effective Methodology for DC Series Arc Fault 
Diagnosis in Photovoltaic systems,” IEEE Access, vol. 7, pp. 45831-45840, April 2019. 
 
 
6. Intelligent DC Series Arc Fault Detection in PV Systems 
using LTCNN-ADA with Limited Target-Domain Fault 
Data 
6.1. Introduction 
Chapter 5 addressed the domain-shifting problem under the extreme condition 
that no fault data from the target domain is available. This is not always 
the case. For example, Sandia National Laboratories and many manufacturers put a 
lot of effort into collecting target-domain fault data in the field. The data collected 
may be somewhat inadequate but is still of significant value. Therefore, it is also 
important to develop strategies to optimise the performance of DL-based algorithms 
with a limited amount of target-domain fault data. 
This chapter proposes a new framework, LTCNN-ADA, which aims to improve the 
performance of intelligent DC series arc fault detection with a limited amount of target-domain 
fault data using transfer learning. Most existing approaches use deep transfer 
learning, employing very deep neural networks to enable complex cross-domain feature 
learning and knowledge transfer [161], [171], [185], [186]. For example, Shao et al. 
[184] applied deep transfer learning to machine fault diagnosis using the well-known 
VGG16 [156]. However, the major drawback of these approaches is that it is difficult to 
deploy the trained deep models cost-effectively on real-time edge devices. In the 
proposed LTCNN-ADA, a lightweight transfer learning-based strategy using a 
lightweight transfer network is adopted. To further boost the diagnosis performance and 
stabilise the knowledge transfer process from the source domain to the target domain, ADA, 
which can augment the fault dataset through adversarial learning, is also applied. 
 
The proposed LTCNN-ADA framework is formulated under the following 
conditions: 
1. Sufficient normal data and arcing data can be obtained in the source domain (e.g. 
laboratory setup based on PV emulator), while only sufficient normal data and 
limited fault data can be obtained in the target domain for training (real grid-
connected PV systems). 
2. The source-domain data and target-domain data have different distributions 
because the system parameters, characteristics, and working conditions are 
different. 
6.2. Experimental Setup 
6.2.1. Experimental Setup and Conditions in Source Domain 
Source-domain experiments are performed using a programmable DC power 
supply, an arc generator designed according to UL-1699B (2018) [7], a 1.5-kW single-phase 
solar inverter, and a 5-kW three-phase solar inverter. The DC power supply is 
programmed using PV Power Profile Emulation software to simulate PV systems with 
different combinations of temperature, irradiance level, and system voltage and current 
levels (at STC) to interface with single-phase or three-phase inverters. Then, DC series 
arc faults are generated at a separation speed of 5 mm/s and a gap distance of 0.5 mm under 
different experimental conditions. Detailed experimental conditions and setup in the 
source domain are given in Table 6.1 and Figure 6.1. Finally, Dataset A and Dataset B 
are prepared. Each dataset consists of 20,000 arcing samples and 20,000 normal 
samples, and each sample consists of 400 points (20 ms duration at a 20-kHz 
sampling rate). 
 
 
Figure 6.1 Experimental setup in: (a) source domain; (b) target domain 
6.2.2. Experimental Setup and Conditions in Target Domain 
Target-domain experiments are performed using four to twelve JINKO JKM350M-72 
mono-crystalline solar panels and more than 82 m of 6 mm² regular solar cable, 
together with the above-mentioned components except the DC power supply, as shown 
in Figure 6.1. DC series arc fault experiments are designed and carried out based on the 
recommendations of UL-1699B (2018) at different current levels, separation rates, 
gap distances, and arc fault locations (at the start, middle, and end of the PV array), as 
presented in Table 6.1. With these configurations, Dataset C (16,000 
normal samples and 10,500 arcing samples) and Dataset D (16,000 normal samples and 
10,500 arcing samples) are prepared at a 20-kHz sampling rate. 
Table 6.1 Description of datasets for LTCNN-ADA case study 

| Dataset index | Domain | Inverter | Power supply | System voltage and current (STC) |
|---|---|---|---|---|
| A | Source | SMA Sunny Boy single-phase inverter (1.5 kW) | Magna PV emulator | V_OC: 190 V; I_SC: 9.38 A |
| B | Source | SMA Sunny Tripower three-phase inverter (5 kW) | Magna PV emulator | V_OC: 380-665 V; I_SC: 9.38 A |
| C | Target | SMA Sunny Boy single-phase inverter (1.5 kW) | JKM350M-72 PV panels | V_OC: 190 V; I_SC: 9.38 A |
| D | Target | SMA Sunny Tripower three-phase inverter (5 kW) | JKM350M-72 PV panels | V_OC: 238-570 V; I_SC: 9.38 A |

| Dataset index | Domain | Temperature and irradiance level | Separation distance and separation rate | Number of extracted real samples (N: normal, A: arcing) |
|---|---|---|---|---|
| A | Source | T: 0-45 °C; Ir: 400-1000 W/m² | 0.5 mm; 5 mm/s | N: 20,000; A: 20,000 |
| B | Source | T: 0-45 °C; Ir: 400-1000 W/m² | 0.5 mm; 5 mm/s | N: 20,000; A: 20,000 |
| C | Target | Sunny/Cloudy; Morning/Afternoon | 0.8-2.5 mm; 2.5-5 mm/s | N: 16,000; A: 10,500 |
| D | Target | Sunny/Cloudy; Morning/Afternoon | 0.8-2.5 mm; 2.5-5 mm/s | N: 16,000; A: 10,500 |
 
Similar to the case study in previous chapters, every time-sequence sample is pre-
processed using [-1, 1] minmax normalisation and arranged into a 2D sample of 20×20 
size before being fed into any algorithms in this chapter. 
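
The pre-processing step described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `preprocess`, the per-window scaling, the row-major reshape, and the handling of a flat window are all assumptions made for the example.

```python
import numpy as np

def preprocess(window):
    """Scale one 400-point window to [-1, 1] and arrange it as a 20x20 matrix."""
    w = np.asarray(window, dtype=float)
    if w.size != 400:
        raise ValueError("expected 400 points (20 ms at 20 kHz)")
    lo, hi = w.min(), w.max()
    if hi == lo:
        # Flat window: avoid division by zero (assumed convention).
        scaled = np.zeros_like(w)
    else:
        # Min-max normalisation to the range [-1, 1].
        scaled = 2.0 * (w - lo) / (hi - lo) - 1.0
    # Row-major fill of the 20x20 matrix is an assumption.
    return scaled.reshape(20, 20)

# Hypothetical input: a ramp standing in for a measured current window.
sample = preprocess(np.linspace(3.0, 9.0, 400))
print(sample.shape, sample.min(), sample.max())
```

Scaling each window independently removes the dependence on the absolute current level, which differs between the emulator-based and panel-based setups.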
6.3. Proposed LTCNN-ADA 
For ease of reference, the symbols that are frequently used in this chapter are 
summarised in Table 6.2. 
Table 6.2 Symbols and descriptions 

| Symbol | Description | Symbol | Description |
|---|---|---|---|
| 𝒟 | Domain | N | Number |
| x | A sample | X | Sample matrix |
| y | Label of a sample | Y | Label vector |
| s, t (subscript/superscript) | Source, target | n, a (subscript/superscript) | Normal, arcing |
| P | Data distribution | L | Loss function |
| 𝔼(∙) | Expectation of the input | W, b | Weight matrix and bias |
| 𝛶 | Label space | 𝜒 | Data space |
 
6.3.1. Transfer Learning 
Generally, the available training data must be adequate for ANN-based classifiers 
to achieve acceptable fault diagnosis performance. However, for in-service systems in 
real industry scenarios, only a few labelled fault samples can be obtained since the systems 
are not allowed to operate continuously under fault conditions, making it time-consuming 
and difficult to obtain sufficient fault data. To overcome this problem, transfer learning 
techniques have been introduced [161], [171]. Transfer learning allows an ANN to make 
use of prior knowledge from the source domain, where training samples are 
sufficient, and then adapt the learned knowledge to assist the training process in the target 
domain with limited data. 
Consider a source domain $\mathcal{D}_s = \{\chi_s, P(X_s)\}$ and a target domain 
$\mathcal{D}_t = \{\chi_t, P(X_t)\}$, where $\chi_s$ and $\chi_t$ represent the data spaces of the 
source and target domains, $P(X_s)$ and $P(X_t)$ represent the corresponding marginal 
probability distributions, $X_s \in \chi_s$, and $X_t \in \chi_t$. In this chapter, 
$X_s = \{x_i^{s,a}, x_j^{s,n}\}_{i,j}^{N_{s,a},N_{s,n}}$ and 
$X_t = \{x_i^{t,a}, x_j^{t,n}\}_{i,j}^{N_{t,a},N_{t,n}}$ denote the source-domain and 
target-domain data, and the corresponding labels are 
$Y_s = \{y_i^{s,a}, y_j^{s,n}\}_{i,j}^{N_{s,a},N_{s,n}}$ and 
$Y_t = \{y_i^{t,a}, y_j^{t,n}\}_{i,j}^{N_{t,a},N_{t,n}}$, respectively. In real 
scenarios, the number of arcing fault samples in the target domain, $N_{t,a}$, is generally 
limited ($N_{t,a} \ll N_{s,a}$). Given a learning task $\mathcal{T} = \{\Upsilon, P(Y|X)\}$, 
where $\Upsilon$ is the label space, $Y \in \Upsilon$, and $P(Y|X)$ is the conditional 
probability distribution (prediction function), the main objective of $\mathcal{T}$ is to 
maximise the classification performance. When $\mathcal{D}_s \neq \mathcal{D}_t$, transfer 
learning aims to accomplish $\mathcal{T}_t$ in $\mathcal{D}_t$ by leveraging the knowledge in 
$\mathcal{D}_s$ and $\mathcal{T}_s$. Such a target-domain learning task achieved by transfer 
learning can be denoted as $\mathcal{T}_t: \mathcal{D}_s \to \mathcal{D}_t$. Figure 6.2 
illustrates the difference between traditional machine learning methods and transfer 
learning-based methods. 
[Figure 6.2 labels: source domain and target domain; traditional machine learning (misclassify); enhanced by transfer learning (knowledge transfer); known samples and unknown samples]
 
Figure 6.2 Illustrations of traditional ML methods and ML methods enhanced by 
transfer learning 
6.3.2. Wasserstein Generative Adversarial Networks 
GANs are known to be difficult to train and suffer from problems such as 
mode collapse, non-convergence, and diminished gradients [187], even in the case of DCGAN. 
The Wasserstein generative adversarial network with gradient penalty (WGAN-GP) was 
proposed to address those problems. WGAN-GP replaces the Jensen-Shannon divergence with the 
Wasserstein distance in the final objective function, and a gradient penalty term is 
introduced to help the gradients flow back to the generator. The loss function of 
WGAN-GP is written as: 

$$\min_{G}\max_{D} L_{GAN}(D, G) = \mathbb{E}_{x}[D(x)] + \mathbb{E}_{z}[D(G(z))] + \eta\,\mathbb{E}_{\hat{x}}\left[\left(\lVert\nabla_{\hat{x}} D(\hat{x})\rVert_2 - 1\right)^2\right] \qquad (6.1)$$

where D stands for the discriminator and G stands for the generator; the Wasserstein 
distance is estimated by the first two terms; the last term denotes the gradient penalty 
with a weighting factor $\eta$; $\hat{x} \sim P_{\hat{x}}(\hat{x})$, where $P_{\hat{x}}(\hat{x})$ 
is the distribution obtained by sampling uniformly along straight lines between pairs of 
points sampled from $P_{real}(x)$ and the generator sample distribution $P_G(x)$; and 
$\lVert\nabla_{\hat{x}} D(\hat{x})\rVert_2$ is the gradient norm for a random sample $\hat{x}$. 
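
To make the roles of the interpolated sample and the gradient penalty concrete, the sketch below evaluates a WGAN-GP critic loss for a toy linear critic D(x) = w·x, whose gradient with respect to the input is simply w. It is an illustration under stated assumptions, not the network used in this chapter: the linear critic, the function name `critic_loss_linear`, and the sign convention (the critic minimises E[D(G(z))] − E[D(x)] plus the weighted penalty, as in the usual WGAN-GP formulation) are all assumed. Only the penalty weight of 10 is taken from this study.

```python
import numpy as np

def critic_loss_linear(w, real, fake, eta=10.0, rng=None):
    """WGAN-GP critic loss for a toy linear critic D(x) = w . x (illustrative only)."""
    rng = np.random.default_rng(0) if rng is None else rng
    # x_hat: uniform interpolation between paired real and generated samples.
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    # For a linear critic the input gradient equals w at every x_hat.
    grad = np.broadcast_to(w, x_hat.shape)
    grad_norm = np.linalg.norm(grad, axis=1)
    penalty = np.mean((grad_norm - 1.0) ** 2)
    # Critic's estimate of the Wasserstein distance (assumed sign convention).
    wasserstein = np.mean(real @ w) - np.mean(fake @ w)
    loss = -wasserstein + eta * penalty
    return loss, wasserstein, penalty

# Hypothetical flattened 20x20 samples; w is chosen with unit norm.
real = np.ones((4, 400))
fake = np.zeros((4, 400))
w = np.ones(400) / 20.0
loss, wd, gp = critic_loss_linear(w, real, fake)
print(loss, wd, gp)
```

With a unit-norm w the penalty vanishes, which shows how the penalty term pushes the critic towards unit gradient norm along the interpolation lines.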
6.3.3. Procedures of the Proposed LTCNN-ADA 
The experimental signals in the time domain and their frequency spectra are 
illustrated in Figure 6.3. As can be seen from the figure, the signals carry sufficient 
information for arc fault detection at frequencies below 10 kHz in both single-phase and 
three-phase PV systems, which supports the choice of a 20-kHz sampling rate. 
Note that every time-sequence sample in both the source domain and the target domain is 
pre-processed using [-1, 1] minmax normalisation and arranged into an image-based 
signal of 20×20 size before being fed into any neural network in LTCNN-ADA. 
Given a target-domain dataset {𝑋𝑡, 𝑌𝑡} for training, including {𝑋𝑡,𝑎, 𝑌𝑡,𝑎} with 𝑁𝑡,𝑎 
arcing samples and {𝑋𝑡,𝑛, 𝑌𝑡,𝑛} with 𝑁𝑡,𝑛 normal samples, the main steps of the 
proposed LTCNN-ADA, as illustrated in Figure 6.4, comprise: 
 
 
[Figure 6.3 panels (a) and (b): signals shown without and with MinMax normalisation; annotated events: inverter switch-on, inverter initialisation, instant of partial shading, inverter normal operation, arcing fault]
Figure 6.3 Examples of experimental signals (sampled at 200-kHz for analysis purpose) 
and their frequency spectra under normal and arcing conditions in (a) single-phase PV 
system; (b) three-phase PV system. 
 
Step 1: Perform the ADA using {𝑋𝑡,𝑎, 𝑌𝑡,𝑎} and WGAN-GP according to (6.1) 
through an adversarial training process. For G, a fully-connected layer with 3,200 
neurons is used, followed by two up-sampling convolution layers with 128 and 64 
filters. A BN-ReLU layer is used after each of these layers [138], [139]. The last 
layer of G is a convolution layer with one filter (5 × 5 in size) and a tanh activation 
function, which produces data in the range [-1, 1]. A latent vector fed to 
G is therefore first reshaped into a tensor of size 5 × 5 × 128 after the fully-
connected layer, and a synthesised sample of the desired size, 20 × 20 × 1, is then 
generated by the remaining layers. For D, four convolution layers 
with 16, 32, 64, and 128 filters are adopted, and LeakyReLU (0.2) is used as the activation 
function. Unless otherwise mentioned, the filter size is 3 × 3 for all convolution layers. 
The choices of the GAN architecture and hyperparameters are based on the 
recommendations for image synthesis tasks in [136], [177], [187]. In this study, based 
on [187], the learning rate is 𝛿 = 0.0001, the batch size is 𝑁𝑏𝑎𝑡𝑐ℎ = 64, the gradient penalty 
factor is 𝜂 = 10, and the number of epochs is 14,000 for training WGAN-GP. It is common 
practice to train a CNN with a balanced dataset, which is expected to lead to the best 
performance [188], [189]. Therefore, after WGAN-GP has converged, the trained generator 
is used to create 𝑁𝑡,𝑔𝑎 generated arcing samples so that the target-domain dataset is 
balanced, where 𝑁𝑡,𝑛 = 𝑁𝑡,𝑎 + 𝑁𝑡,𝑔𝑎. Then, {𝑋𝑡,𝑔𝑎, 𝑌𝑡,𝑔𝑎}, which consists of the 𝑁𝑡,𝑔𝑎 
generated arcing samples and corresponding labels, is included in {𝑋𝑡, 𝑌𝑡} for Step 3. 
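
The shape bookkeeping of G and the balancing rule in Step 1 can be checked with a few lines of arithmetic. The sketch below is hedged: the helper names are hypothetical, and an up-sampling factor of 2 per stage is an assumption (it is the factor that takes 5 × 5 to 20 × 20 in two stages). The example count of 11,000 normal training samples is the figure given in Section 6.4.2.

```python
def generator_shapes(base=5, channels=128, upsample_stages=2, factor=2):
    """Return the fully-connected width and the final output shape of G."""
    fc_units = base * base * channels          # 5 * 5 * 128
    side = base * factor ** upsample_stages    # 5 -> 10 -> 20 (factor assumed)
    return fc_units, (side, side, 1)

def generated_count(n_normal, n_arcing_real):
    """Number of generated arcing samples so that N_t,n = N_t,a + N_t,ga."""
    if n_arcing_real > n_normal:
        raise ValueError("more real arcing samples than normal samples")
    return n_normal - n_arcing_real

fc_units, out_shape = generator_shapes()
n_generated = generated_count(n_normal=11_000, n_arcing_real=60)
print(fc_units, out_shape, n_generated)
```

For the case with 60 real arcing samples, almost all arcing samples seen during fine-tuning are therefore synthetic, which is why the quality checks in Section 6.4.1 matter.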
Step 2: Train a lightweight CNN classifier to convergence using {𝑋𝑠, 𝑌𝑠}, which consists 
of sufficient source-domain arcing and normal samples, with the cross-entropy loss, 𝐿𝐶𝐸, and 
the popular SGD algorithm to initialise the LTCNN [136], as shown in (6.2)-(6.3): 

$$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\left( y_i \log\frac{1}{1 + e^{-(W_{lf}^{T} f(x_i) + b_{lf})}} + (1 - y_i)\log\left(1 - \frac{1}{1 + e^{-(W_{lf}^{T} f(x_i) + b_{lf})}}\right)\right) \qquad (6.2)$$

$$\theta_{LTCNN} \leftarrow SGD\left(\theta_{LTCNN}, \nabla_{\theta_{LTCNN}}\left(L_{CE}(X_s, Y_s)\right), \delta\right) \qquad (6.3)$$

where the subscript lf denotes the last layer of the whole neural network, $\theta$ denotes the 
network parameters, $\nabla(\cdot)$ stands for the calculated gradients of the input, and $\delta$ 
is the learning rate. For LTCNN, a convolution layer with three 5 × 5 filters and a max-pooling 
layer with 2 × 2 kernel size are used first. Then, two fully-connected layers 
with eight and five neurons are connected before the classification layer. BN-ReLU 
layers are applied as well. This is the same CNN structure as designed in Chapter 4. 
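
As an illustration of (6.2), the sketch below evaluates the cross-entropy loss of a single-output sigmoid layer. It assumes that f(x_i) is the feature vector entering the last layer (five values, matching the five-neuron layer described above) and that labels are 1 for arcing and 0 for normal; both are assumptions for the example, as is the function name.

```python
import numpy as np

def cross_entropy(features, labels, w_lf, b_lf):
    """Binary cross-entropy of a sigmoid output layer, following the form of (6.2)."""
    z = features @ w_lf + b_lf
    # Numerically stable logs: log(sigmoid(z)) and log(1 - sigmoid(z)).
    log_p = -np.logaddexp(0.0, -z)
    log_1mp = -np.logaddexp(0.0, z)
    return -np.mean(labels * log_p + (1.0 - labels) * log_1mp)

# Hypothetical batch: four samples with five features each.
features = np.zeros((4, 5))
labels = np.array([1.0, 0.0, 1.0, 0.0])
loss = cross_entropy(features, labels, w_lf=np.zeros(5), b_lf=0.0)
print(loss)
```

With zero weights the sigmoid output is 0.5 for every sample, so the loss equals log 2, the value expected from an untrained binary classifier.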
Step 3: Fine-tune all the parameters of the pre-trained LTCNN obtained from Step 
2 with the augmented dataset {𝑋𝑡, 𝑌𝑡} obtained from Step 1, using the loss 𝐿𝐿𝑇𝐶𝑁𝑁, based on 
(6.4)-(6.5): 

$$L_{LTCNN} = L_{CE} + \lambda L_{regularizer} \qquad (6.4)$$

$$\theta_{LTCNN} \leftarrow SGD\left(\theta_{LTCNN}, \nabla_{\theta_{LTCNN}}\left(L_{LTCNN}(X_t, Y_t)\right), \delta\right) \qquad (6.5)$$

where $\lambda$ is the regularisation factor. Note that {𝑋𝑠, 𝑌𝑠} can be included during the 
training process in Step 3 if the regulariser in (6.4) is enabled. After being fine-tuned, 
the trained model can be deployed in real-time devices to achieve online DC series arc 
fault detection for the target domain. In Step 2 and Step 3 in this study, $\delta = 0.001$, 
$N_{batch} = 64$, $\lambda = 0$, the number of epochs for LTCNN pre-training in Step 2 is 
$\frac{2N_{s,n}}{N_{batch}} \times 10$, and the number of epochs in Step 3 is 
$\frac{2N_{t,n}}{N_{batch}} \times 10$. 
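
For orientation, the two training-length expressions above can be evaluated numerically. The sketch simply computes the stated expression; the function name is hypothetical, 20,000 is the number of normal samples in each source dataset, and 11,000 is the number of target-domain normal training samples quoted in Section 6.4.2.

```python
def training_length(n_normal, n_batch=64, multiplier=10):
    """Evaluate 2 * N_normal / N_batch * 10 as given for Step 2 and Step 3."""
    return 2 * n_normal / n_batch * multiplier

pretrain_length = training_length(20_000)   # Step 2, source domain
finetune_length = training_length(11_000)   # Step 3, target domain
print(pretrain_length, finetune_length)
```

The factor 2 reflects the balanced dataset, in which the arcing class (real plus generated) matches the normal class in size.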
 
[Figure 6.4 blocks: Step 1, dataset preparation (latent input, generator, critic, generated arcing data); Step 2, initialisation using source-domain data (PV emulator, source domain under lab environment, normal and arcing samples, loss L_CE,Source); Step 3, fine-tuning using target-domain data and generated arcing data from WGAN-GP (PV panels, target domain under realistic environment, normal and limited arcing samples, losses L_CE,Target and optional L_Regularizer,Target). Network: feature extractor (CNN blocks), higher-level feature reasoning, decision layer. Solid lines: data feedforward; dotted lines: gradient back-propagation to update model parameters]
Figure 6.4 Overview of the proposed LTCNN-ADA framework 
6.4. Case Study 1: Offline Validation Results 
6.4.1. Analysis and Evaluation of Generated Arcing Data 
6.4.1.1. Training Loss Curves Analysis of ADA by WGAN-GP 
Unlike other machine learning models, which can be evaluated by their loss functions, 
the quality of a GAN generator is difficult to evaluate based on its loss alone [190]. 
During the training of a GAN, the losses of the generator and discriminator should 
maintain an equilibrium. Once they converge to an equilibrium, the generator can 
be assessed by qualitative analysis (e.g. manual inspection or feature visualisation) and 
quantitative analysis (e.g. performance on targeted classification tasks). Several 
examples of training loss curves of WGAN-GP with 𝑁𝑡,𝑎 = 10, 20, 30, 60, 150, and 300 
from target-domain Dataset C are shown in Figure 6.5. WGAN-GP with 𝑁𝑡,𝑎 = 10 exhibits 
extreme instability and fails to converge, whereas with more samples included in the 
training process WGAN-GP readily converges. Therefore, to avoid the negative impact of 
non-convergence [136], 𝑁𝑡,𝑎 ≥ 20 is recommended. 
[Figure 6.5 panels: N_t,a = 10, 20, 30, 60, 150, and 300]
 
Figure 6.5 Training loss curves of WGAN-GP with different 𝑁𝑡,𝑎 
6.4.1.2. High Dimensional Feature Visualisation 
To demonstrate the effectiveness of the ADA, t-SNE is applied to 
visualise the high-dimensional features of the generated fault samples by mapping them 
from the feature space (the output before the classification layer of the LTCNN classifier) 
into a low-dimensional space (2D in this study) [181], [182]. Different transfer tasks 
with 𝑁𝑡,𝑎 = 60 are studied for illustration, and the results are shown in Figure 
6.6. The visualisation indicates that the fault samples generated by ADA share 
similar features and distributions with the real arcing samples, including samples from 
the training dataset and unseen samples from the testing dataset. Therefore, ADA can 
generate meaningful samples to assist the training process of the LTCNN. This 
observation is also confirmed by the quantitative analysis in Section 6.4.2. 
[Figure 6.6 panels: B→C, B→D, A→D, A→C]
 
Figure 6.6 2D visualisation using t-SNE under different transfer tasks (𝑁𝑡,𝑎 = 60) 
6.4.1.3. Frequency Domain Analysis 
Manual inspection of generated synthetic samples is a common 
technique for evaluating GANs. In applications such as handwritten digit 
recognition, for example, the generated samples are understandable to humans [176]. 
In this research, however, the signals are hard to interpret through manual inspection 
alone. The 2D matrices of normal, real arcing, and generated arcing 
samples are visualised in Figure 6.7, and it is difficult to see the difference between 
the samples illustrated in Figure 6.7(a)-(c). Therefore, FFT is applied to evaluate the 
generated fault samples in the frequency domain. The samples are first converted back to 
time-series signals, and FFT is then performed to obtain the frequency spectrum. As can 
be seen in Figure 6.7(d), the spectra of the generated and real fault samples are similar. 
These results indicate that the ADA method is able to generate samples with the key 
characteristics of real fault samples. 
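
The frequency-domain check described above can be sketched as follows. This is a minimal example under assumptions: the 20×20 sample is flattened in row-major order, the mean is removed before the transform, and a synthetic 2-kHz tone stands in for a measured signal. With 400 points at 20 kHz, the spectrum has 50-Hz resolution and extends to 10 kHz.

```python
import numpy as np

FS = 20_000  # sampling rate in Hz, as used for the datasets

def spectrum(sample_2d, fs=FS):
    """Flatten a 20x20 sample to a time series and return its one-sided spectrum."""
    x = np.asarray(sample_2d, dtype=float).reshape(-1)  # row-major order assumed
    x = x - x.mean()                                    # remove the DC offset
    magnitude = np.abs(np.fft.rfft(x)) / x.size
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return freqs, magnitude

# Hypothetical test signal: a 2-kHz tone in place of a real or generated sample.
t = np.arange(400) / FS
tone = np.sin(2 * np.pi * 2000 * t).reshape(20, 20)
freqs, magnitude = spectrum(tone)
peak_hz = freqs[np.argmax(magnitude)]
print(peak_hz, freqs[1], freqs[-1])
```

Comparing the magnitude arrays of real and generated samples bin by bin gives the kind of spectral comparison shown in Figure 6.7(d).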
 
Figure 6.7 (a) examples of three phase normal signal; (b) examples of three phase 
arcing signal; (c) examples of generated three phase arcing signal; (d) frequency spectra 
of the normalised time-series signals; those signals are randomly selected from the 
training dataset D and generated arcing dataset by ADA in the transfer task of A→D 
(𝑁𝑡,𝑎 = 60) 
 
6.4.2. Results with Different Number of Fault Samples 
In order to demonstrate the improvement and effectiveness of the proposed 
framework, different approaches are investigated. For simplicity, no regularisers are 
used during the training process in this chapter: 
1. CNN (Baseline): Only the lightweight CNN is trained, using the 
target-domain dataset alone; neither transfer learning nor ADA is applied. It is the 
original method developed in Chapter 4. 
2. LTCNN without ADA: To demonstrate the effect of lightweight transfer 
learning strategy, the lightweight CNN is firstly well-trained using the source-
domain dataset. Then its parameters are fine-tuned using the target-domain 
dataset. No ADA is applied. 
3. LTCNN-ADA: The lightweight CNN is firstly well-trained using the source-
domain dataset. Then its parameters are fine-tuned using a combination of the 
target-domain dataset and generated fault dataset by ADA. 
The effect of the amount of target-domain fault data is evaluated in this section 
under different transfer tasks (A→C, A→D, B→C, and B→D). For Dataset C and 
Dataset D, 11,000 of the 16,000 normal samples are used for training, and the 
remaining 5,000 samples are reserved for testing. For the arcing fault dataset, 3,000 
arcing samples come from cases where arc faults are generated at the start of the PV 
array, from which 20, 30, 60, 150, or 300 samples are randomly selected for the target-
domain training dataset. Another 2,500 new arcing 
samples generated at the same fault location are prepared for testing. Combined with 5,000 
arcing samples from cases where arc faults are generated at the middle and the end of the 
PV array, a total of 7,500 fault samples are prepared for the testing dataset. 
These samples are extracted from more than 40 cases, and the arc fault duration of 
each case is within a few seconds. 
It should be emphasised that only fault samples generated from the start of the PV 
array are used during the training. For testing, apart from using a new set of fault 
samples generated from the same fault location, fault samples from two new cases (arc 
fault generated at the middle and the end of the PV array) are also tested. Therefore, the 
generalisation property of the proposed framework can be validated. 
Figures 6.8 and 6.9 compare the overall accuracy (Acc), arcing accuracy (F), and normal 
accuracy (N) of the four domain transfer tasks. These metrics are calculated using 
(4.26), (4.22), and (4.23), respectively. The different approaches (CNN, LTCNN, LTCNN-
ADA) are evaluated using different numbers of target-domain fault samples. 
Each reported result is the average over 27 
trials, to mitigate the bias and randomness introduced by the random selection of the dataset. 
More fault samples lead to higher accuracies. The 
normal accuracies are close to 100% in all cases since sufficient 
normal samples are used for training, while the arcing accuracies depend heavily on the 
number of fault samples. 
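
The relation between the three metrics can be illustrated with a short calculation. Equations (4.26), (4.22), and (4.23) are not reproduced in this chapter, so the sketch assumes that Acc is the fraction of all test samples classified correctly, which makes it the sample-weighted mean of F and N over the 7,500 arcing and 5,000 normal test samples. The two input pairs are values reported in Figure 6.8 for A→C with 20 real fault samples.

```python
def overall_accuracy(arcing_acc, normal_acc, n_arcing=7_500, n_normal=5_000):
    """Sample-weighted mean of arcing and normal accuracy (assumed definition of Acc)."""
    total = n_arcing + n_normal
    return (arcing_acc * n_arcing + normal_acc * n_normal) / total

acc_cnn = overall_accuracy(60.7, 99.9)    # CNN baseline
acc_ada = overall_accuracy(96.4, 100.0)   # LTCNN-ADA
print(round(acc_cnn, 1), round(acc_ada, 1))
```

Because the test set contains more arcing than normal samples, the overall accuracy is dominated by the arcing accuracy, which is why the discussion below concentrates on F.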
 
[Figure 6.8 bar-chart data, in %, vertical axis: Percentage (%) from 45 to 100; each cell lists Acc / F / N for the given number of target-domain fault samples]

(a) A→C

| Method | 20 | 30 | 60 | 150 | 300 |
|---|---|---|---|---|---|
| CNN | 76.4 / 60.7 / 99.9 | 84 / 73.4 / 99.9 | 87.8 / 79.7 / 99.9 | 95.1 / 91.8 / 99.9 | 97.1 / 95.2 / 99.9 |
| LTCNN | 96.1 / 93.5 / 99.9 | 96.5 / 94.2 / 99.9 | 97.5 / 95.8 / 100 | 98.3 / 97.1 / 99.9 | 98.7 / 97.8 / 99.9 |
| LTCNN-ADA | 97.8 / 96.4 / 100 | 98.1 / 96.8 / 100 | 98.4 / 97.4 / 100 | 98.8 / 98.1 / 99.9 | 99.2 / 98.7 / 100 |

(b) B→C

| Method | 20 | 30 | 60 | 150 | 300 |
|---|---|---|---|---|---|
| CNN | 76.4 / 60.7 / 99.9 | 84 / 73.4 / 99.9 | 87.8 / 79.7 / 99.9 | 95.1 / 91.8 / 99.9 | 97.1 / 95.2 / 99.9 |
| LTCNN | 95 / 91.7 / 99.9 | 96 / 93.4 / 99.9 | 97 / 95 / 99.9 | 97.9 / 96.6 / 99.9 | 98.5 / 97.6 / 99.9 |
| LTCNN-ADA | 97.1 / 95.2 / 100 | 97.6 / 96 / 100 | 98 / 96.7 / 100 | 98.6 / 97.7 / 100 | 99 / 98.3 / 100 |
Figure 6.8 Diagnosis results of different methods for Dataset C: (a) A→C; (b) B→C 
 
[Figure 6.9 bar-chart data, in %, vertical axis: Percentage (%) from 45 to 100; each cell lists Acc / F / N for the given number of target-domain fault samples]

(a) A→D

| Method | 20 | 30 | 60 | 150 | 300 |
|---|---|---|---|---|---|
| CNN | 77.5 / 62.5 / 100 | 83.4 / 72.4 / 100 | 92.3 / 87.2 / 100 | 95.9 / 93.3 / 100 | 97.9 / 96.5 / 99.9 |
| LTCNN | 96.5 / 94.2 / 99.9 | 96.9 / 94.8 / 100 | 98.1 / 96.8 / 100 | 98.7 / 97.9 / 99.9 | 99.1 / 98.6 / 99.9 |
| LTCNN-ADA | 98.2 / 97.1 / 100 | 98.5 / 97.5 / 100 | 98.8 / 98 / 100 | 99.2 / 98.7 / 100 | 99.4 / 99.1 / 100 |

(b) B→D

| Method | 20 | 30 | 60 | 150 | 300 |
|---|---|---|---|---|---|
| CNN | 77.5 / 62.5 / 100 | 83.4 / 72.4 / 100 | 92.3 / 87.2 / 100 | 95.9 / 93.3 / 100 | 97.9 / 96.5 / 99.9 |
| LTCNN | 96.8 / 94.7 / 100 | 97.2 / 95.3 / 100 | 98.2 / 97 / 99.9 | 98.8 / 98.1 / 99.9 | 99.2 / 98.6 / 100 |
| LTCNN-ADA | 98.4 / 97.3 / 100 | 98.6 / 97.7 / 100 | 98.9 / 98.2 / 100 | 99.2 / 98.8 / 100 | 99.5 / 99.2 / 100 |
Figure 6.9 Diagnosis results of different methods for Dataset D: (a) A→D; (b) B→D 
 
 
 
In all cases, the arcing accuracy of the proposed LTCNN-ADA method 
exceeds that of the other methods, especially when the target-domain arcing dataset 
is small. For instance, in the A→C task with 𝑁𝑡,𝑎 = 20 in Figure 6.8(a), the proposed 
method achieves 96.4% arcing accuracy, while CNN 
and LTCNN achieve only 60.7% and 93.5%, respectively. When the 
number of fault samples increases to 300, the arcing accuracies of CNN, LTCNN, and 
the proposed method are 95.2%, 97.8%, and 98.7%, respectively. In more challenging 
transfer tasks, where the inverters and power sources are completely different between 
the source and target domains, e.g. B→C with 𝑁𝑡,𝑎 = 20 as shown in Figure 6.8(b), 
significant improvements over CNN are also obtained by LTCNN (31 percentage points) 
and LTCNN-ADA (34.5 percentage points). 
In general, the diagnosis performance of CNN is not reliable because of its low 
accuracy and large standard deviations. LTCNN dramatically improves on the performance 
of CNN because lightweight transfer learning is applied to leverage the knowledge learnt 
from the source domain to assist the training process in the target domain. 
LTCNN-ADA further increases the performance, and the increase is more noticeable 
for smaller fault data sizes. When 𝑁𝑡,𝑎 = 20, compared to LTCNN, LTCNN-ADA 
improves the arc fault detection accuracy by 2.9, 2.9, 3.5, and 2.6 percentage points for 
transfer tasks A→C, A→D, B→C, and B→D, respectively. In addition, the standard 
deviations are reduced by approximately half. When 𝑁𝑡,𝑎 increases to 300, improvements of 
0.9, 0.5, 0.7, and 0.6 percentage points are still achieved. These gains remain meaningful 
because the accuracy of LTCNN has already reached around 98%. In this case, the 
mis-operation rates of LTCNN, calculated using (6.6), are 2.2%, 1.4%, 2.4%, and 1.4% for 
the four transfer tasks, respectively, while those of LTCNN-ADA are 1.3%, 0.9%, 1.7%, and 
0.8%. The mis-operation rate is therefore reduced by 30% to 43% in relative terms. 

$$\alpha = 1 - Arcing = \left(1 - \frac{TP}{TP + FP}\right) \times 100\% \qquad (6.6)$$
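
The quoted reduction range follows directly from the mis-operation rates listed above, as the short calculation below shows. The helper name is hypothetical; the input values are the rates stated in the text for A→C, A→D, B→C, and B→D.

```python
def relative_reduction(before, after):
    """Relative reduction in mis-operation rate, in percent."""
    return (before - after) / before * 100.0

ltcnn_rates = [2.2, 1.4, 2.4, 1.4]   # LTCNN: A->C, A->D, B->C, B->D
ada_rates = [1.3, 0.9, 1.7, 0.8]     # LTCNN-ADA, same task order
reductions = [round(relative_reduction(b, a), 1)
              for b, a in zip(ltcnn_rates, ada_rates)]
print(reductions)
```

The smallest reduction occurs for B→C and the largest for B→D, which brackets the 30% to 43% range.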
The impact of the ratio of real fault data to ADA-generated fault data, 
𝑁𝑡,𝑎/(𝑁𝑡,𝑛 − 𝑁𝑡,𝑎), can also be observed: ADA delivers a larger accuracy improvement 
at a lower ratio. The reason is that, with the help of ADA, the proposed method can 
capture more generalised features by looking at a broader set of augmented samples. 
Furthermore, the standard deviations of the proposed method are lower than those of the 
other methods, which indicates that ADA can stabilise the training process. In general, 
the proposed method is more robust. 
6.4.3. Comparative Study with Related Works 
Although remarkable results have been achieved recently, limited work can be 
found on lightweight transfer learning-based DC series arc fault detection. To 
demonstrate the advantages of the proposed framework, a comparative study is carried 
out against several state-of-the-art methods: 
1. DA-DCGAN as presented in Chapter 5 [191]; 
2. Transfer RNN [192]; 
3. Deep transfer learning with VGG16 [185] (the input data is re-sized to fit 
VGG16; the original output layer is replaced by a 1-neuron output layer for 
binary classification); 
4. Deep transfer learning with VGG16 and ADA; 
5. Transfer RNN with ADA. 
 
The same dataset used for the proposed LTCNN-ADA is applied to methods 1)-5) 
to benchmark the performance on the different transfer tasks (𝑁𝑡,𝑎 = 60 as the case study). 
As before, each result is obtained by averaging the results of 27 trials. The performance 
comparison between LTCNN-ADA and the other methods is listed in detail in Table 6.3. 
DA-DCGAN was only validated on the transfer task A→C in [191], while this study 
further explores its performance in other transfer scenarios. The DA-
DCGAN based methodology achieves satisfactory arcing fault detection accuracy in 
A→C (96.85%) and B→D (97.41%). However, for the more challenging tasks 
B→C and A→D, where the solar inverters are completely different in the source and 
target domains, its fault diagnosis accuracy drops drastically to 80.4% and 78.12% on 
average, with large standard deviations. Similarly, the Transfer RNN in method 2) 
performs only moderately in B→C and A→D. Method 3), proposed in [185], employs a pre-
trained deep CNN model called VGG16, which is directly fine-tuned with the target-
domain dataset. In the evaluation, it achieves arcing fault detection 
accuracies of 93.96% and 97.09% on Dataset C and Dataset D, respectively. 
The feasibility of applying ADA to deep neural networks is also investigated. 
Method 4) is a modification of method 3), where VGG16 is pre-trained with 
the source-domain dataset, and then fine-tuned with the target-domain dataset and 
artificial fault samples generated by ADA. Improvements in arc fault diagnosis rate in 
the range of 1.21% to 2.41% are obtained in the four transfer tasks compared to 
the original method in [185]. Meanwhile, the corresponding standard deviations are 
reduced by approximately half. Likewise, method 5) is based on method 2), where ADA 
is applied to increase the diagnosis performance and stabilise the training process. As 
expected, noticeable improvements in the different transfer tasks are observed as well. 
 
In Table 6.3, the proposed LTCNN-ADA demonstrates excellent performance in 
arcing fault detection in all cases. It reaches 97.37% in A→C (highest), 96.66% in 
B→C (highest), 97.95% in A→D (second highest), and 98.15% in B→D (second 
highest). VGG16-ADA has a slightly higher arcing fault detection rate for 
A→D (98.32%) and B→D (98.44%) compared to LTCNN-ADA. Although VGG16-
ADA surpasses LTCNN-ADA in these two transfer tasks by 0.37% and 0.29%, this 
accuracy gain comes at the cost of substantially higher computational 
demand: the inference time of VGG16-ADA is about 125 times longer than that of 
LTCNN-ADA (36.34 s as compared to 0.29 s). Therefore, the proposed LTCNN-ADA 
framework achieves better or comparable DC series arc fault detection accuracy with 
significantly reduced inference time, and so it is more suitable for cost-effective real-time 
deployment in resource-constrained edge devices. 
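
The inference-time comparison can be put on a per-sample basis with simple arithmetic. The sketch below is hedged: it divides the batch times reported in Table 6.3 by 10,000 samples, which gives an average on the GPU test platform and not the latency of the embedded prototype, and the helper name is hypothetical. The 20-ms figure is the duration of one sample window.

```python
WINDOW_S = 0.020  # duration of one 400-point sample window in seconds

def per_sample_latency(batch_seconds, batch_size=10_000):
    """Average inference time per sample from a batch measurement."""
    return batch_seconds / batch_size

ltcnn_ada = per_sample_latency(0.29)    # seconds per sample
vgg16_ada = per_sample_latency(36.34)   # seconds per sample
ratio = vgg16_ada / ltcnn_ada
print(ltcnn_ada, vgg16_ada, round(ratio, 1))
```

On this platform both models finish well within one 20-ms window per sample; the ratio of about 125 matters mainly when the model is moved to much slower edge hardware.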
Table 6.3 Performance comparison of different algorithms on different transfer tasks 
using the same dataset 

Target domain C covers A→C and B→C; target domain D covers A→D and B→D. Each cell lists the arcing accuracy (F) and normal accuracy (N) as mean±std in %.

| Method | A→C (F / N) | B→C (F / N) | A→D (F / N) | B→D (F / N) | Inference time (per 10k samples) |
|---|---|---|---|---|---|
| DA-DCGAN [191] | 96.85±2.73 / 99.74±0.20 | 80.40±14.23 / 99.34±0.42 | 78.12±13.56 / 99.75±0.19 | 97.41±2.66 / 99.70±0.22 | 0.31 s |
| Transfer RNN [192] | 93.85±1.14 / 99.95±0.03 | 87.36±2.29 / 99.94±0.03 | 89.31±2.16 / 99.94±0.03 | 92.38±1.71 / 99.94±0.04 | 13.40 s |
| Transfer RNN-ADA | 96.19±0.86 / 99.95±0.04 | 93.71±1.25 / 99.93±0.03 | 93.00±1.38 / 99.95±0.04 | 95.68±0.98 / 99.96±0.02 | 13.31 s |
| VGG16 [185] | 93.96±1.04 / 99.93±0.03 | 93.96±1.04 / 99.93±0.03 | 97.09±0.87 / 99.96±0.01 | 97.09±0.87 / 99.96±0.01 | 36.16 s |
| VGG16-ADA | 96.37±0.61 / 99.94±0.04 | 95.93±0.76 / 99.95±0.02 | 98.30±0.32 / 99.97±0.02 | 98.44±0.29 / 99.96±0.04 | 36.34 s |
| LTCNN-ADA (this work) | 97.37±0.69 / 99.95±0.04 | 96.66±0.84 / 99.96±0.02 | 97.95±0.71 / 99.95±0.02 | 98.15±0.69 / 99.96±0.03 | 0.29 s |

The inference time of the model is measured on a CentOS 7 Linux operating system 
with a Tesla P100 GPU and an Intel(R) Xeon(R) Gold 6126 CPU. 
6.5. Case Study 2: Online Validation Results 
To validate the feasibility of the proposed LTCNN-ADA framework, a prototype 
based on an NI CompactRIO is developed and tested on a single-phase and a 
three-phase grid-connected PV system. The trained LTCNN (𝑁𝑡,𝑎 = 60) is 
deployed in the prototype using NI LabVIEW. As described in Section 4.5.4, a 
multiple-window strategy is applied to enhance the real-time performance of the 
proposed algorithm. The flowchart of the overall real-time series arc fault 
detection is illustrated in Figure 4.14. The final alarm signal is issued when at least 
m=3 samples are classified as arcing in a sliding window consisting of k=10 samples 
with a step size of 1. 
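
The m-of-k alarm rule described above can be sketched as follows. This is an illustrative Python version, not the LabVIEW implementation on the prototype: the function name is hypothetical, and evaluating the rule before the first ten decisions have arrived is an assumption.

```python
from collections import deque

def alarm_stream(decisions, m=3, k=10):
    """Return the alarm state after each per-sample decision (m-of-k rule, step size 1)."""
    window = deque(maxlen=k)      # holds the k most recent decisions
    alarms = []
    for is_arcing in decisions:
        window.append(bool(is_arcing))
        alarms.append(sum(window) >= m)
    return alarms

# Hypothetical classifier outputs: 1 = arcing, 0 = normal.
burst = alarm_stream([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
isolated = alarm_stream([1] + [0] * 10 + [1] + [0] * 10 + [1])
print(burst.index(True), any(isolated))
```

Three arcing decisions within ten consecutive samples raise the alarm, whereas isolated misclassifications spaced more than ten samples apart never do, which is how the rule suppresses nuisance tripping from single transients.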
Extensive real-time experiments, including arc fault detection tests and unwanted 
tripping tests, are conducted under the guidance of UL-1699B (2018) for the applicable 
cases. Normal transients, such as the inrush current induced by inverter switch-on and the 
load current drop caused by partial shading from fast-moving clouds, are typical events that 
cause nuisance tripping of traditional AFDs [37]. Some example results are 
presented in the rest of this chapter. The real-time signals for each test are displayed 
on an oscilloscope and include: 
• The alarm signal from the prototype (yellow trace, CH1); 
• Loop current (arc current) captured by the current probe with a ratio of 10:1 (green 
trace, CH2); 
• Arc voltage captured by the voltage probe with a ratio of 200:1 (blue trace, CH3); 
• Voltage of the PV inverter captured by the voltage probe with a ratio of 200:1 (pink 
trace, CH4). 
Furthermore, those signals are saved using the DAQ system during the experiments 
so that the arc fault energy can be determined. For AFD tests, the separation speed and 
distance are set to 5 mm/s and 0.8 mm under high-irradiance conditions, and to 2.5 mm/s 
and 0.8 mm under low-irradiance conditions [15]. 
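
One way to determine the arc fault energy from the saved signals is to integrate the instantaneous arc power over the arcing interval. The sketch below is an assumption about the post-processing, which is not detailed in this chapter: it takes arc voltage and arc current arrays that have already been scaled by their probe ratios, applies trapezoidal integration, and uses a hypothetical function name and synthetic constant signals.

```python
import numpy as np

def arc_energy(v_arc, i_arc, fs):
    """Arc energy in joules: trapezoidal integral of v(t) * i(t) sampled at fs."""
    v = np.asarray(v_arc, dtype=float)
    i = np.asarray(i_arc, dtype=float)
    if v.shape != i.shape:
        raise ValueError("voltage and current must have the same length")
    power = v * i
    dt = 1.0 / fs
    return float(np.sum((power[:-1] + power[1:]) * 0.5) * dt)

# Hypothetical record: 20 V and 5 A held for 1 s, sampled at 1 kHz.
energy_j = arc_energy(np.full(1001, 20.0), np.full(1001, 5.0), fs=1000)
print(energy_j)
```

In practice the integration limits would be the arc inception instant and the instant at which the arc is interrupted.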
6.5.1. Three-Phase PV System 
The prototype is first tested in a 5-kW three-phase PV system with 12 panels. 
Two inrush currents can be observed in Figure 6.10 during the initialisation stage of the 
three-phase inverter. After initialisation, the inverter starts to export power with MPPT 
and feed the AC grid. No false tripping occurs during the transients, while an 
alarm signal is raised shortly after the inception of series arcing. In Figure 6.11, 
several intermittent series arc faults followed by sustained series arcing can be observed, 
and the prototype produces the correct logic output in a timely manner. In Figure 6.12, a 
partial shading event with a relatively long duration (more than 30 seconds) followed by 
several intermittent short partial shading events and series arcing can be observed. The 
logic output of the prototype remains false until the series arc fault is introduced. The 
proposed algorithm is also tested at a low irradiance level, where the arc fault 
current is about 2.5 A, as shown in Figure 6.13. 
Similar experiments are repeated in the same three-phase PV system but with only 
6 PV panels. The results are shown in Figure 6.14 to Figure 6.17. The proposed 
algorithm does not respond to large load current drops caused by partial shading, inrush 
currents caused by closing the DC disconnect and initialisation of the inverter, or variations 
introduced by MPPT operation, as shown in Figure 6.14 and Figure 6.16. Different 
types of series arc faults, including intermittent, low-irradiance, and 
sustained arc faults, are accurately detected within a short time. 
 
[Figure 6.10 annotations: three-phase inverter switch-on; inrush current; exporting power; arc inception; alarm]
 
Figure 6.10 Response to inverter switch on, start-up, MPPT, and series arcing in a three 
phase PV system (12 panels) 
[Figure 6.11 annotations: intermittent arc faults; alarms; sustained arcing]
 
Figure 6.11 Response to several intermittent series arcing events generated at the 
middle of the PV array followed by a sustained series arcing in a three phase PV system 
(12 panels) 
 
[Figure 6.12 annotations: partial shading; intermittent partial shading; arc inception; alarm]
 
Figure 6.12 Response to partial shading and series arcing in a three phase PV system 
(12 panels) 
[Figure 6.13 annotations: arc inception; alarm]
 
Figure 6.13 Response to series arcing generated at very low irradiance level in a three 
phase PV system (12 panels) 
 
 
[Figure 6.14 annotations: three-phase inverter switch-on; inrush current; exporting power; MPPT; arc inception; alarm]
 
Figure 6.14 Response to inverter switch on, start-up, MPPT, and series arcing in a three 
phase PV system (6 panels) 
[Figure 6.15 annotations: intermittent arc faults; alarms; sustained arcing]
 
Figure 6.15 Response to several intermittent series arcing events generated at the 
middle of the PV array followed by a sustained series arcing in a three phase PV system 
(6 panels) 
 
[Figure 6.16 annotations: cover two PV panels; cover one PV panel; MPPT; arc inception; alarm]
Figure 6.16 Response to partial shading and series arcing in a three-phase PV system 
(6 panels) 

[Figure 6.17 annotations: arc inception; alarm]
Figure 6.17 Response to series arcing generated at the middle of the PV array at a low 
irradiance level in a three-phase PV system (6 panels) 
 
 
6.5.2. Single-Phase PV System 
After that, the prototype is tested in a 1.5-kW single-phase PV system. In Figure 
6.18 to Figure 6.20, unwanted tripping tests against DC switch operation, MPPT 
operation, and sharp irradiance step changes are performed, and the algorithm works as 
expected without false alarms. Additionally, the proposed algorithm can accurately 
detect different types of series arc faults, such as intermittent arcing and sustained arcing. 
An example detection result for a series arc fault generated in the middle of the PV 
array at a low irradiance level is shown in Figure 6.21. 
 
[Figure 6.18 annotations: 1Ø inverter switch on; open-circuit; inrush current; exporting power; a short arc event; alarm]
Figure 6.18 Response to inverter switch on, start-up, MPPT, and series arcing in a 
single-phase PV system (4 panels) 
 
[Figure 6.19 annotations: intermittent arc faults; alarms; sustained arcing]
Figure 6.19 Response to several intermittent series arcing events followed by a 
sustained series arcing event in a single-phase PV system (4 panels) 

[Figure 6.20 annotations: partial shading; arc inception; alarm]
Figure 6.20 Response to partial shading and series arcing in a single-phase PV system 
(4 panels) 
 
[Figure 6.21 annotations: arc inception; alarm]
Figure 6.21 Response to series arcing generated at the middle of the PV array at a low 
irradiance level in a single-phase PV system (4 panels) 
6.6. Conclusion 
This chapter presents a new framework, LTCNN-ADA, which requires only a 
limited amount of target-domain fault data yet gives better PV series arc fault 
detection accuracy and reduced inference time. It provides an effective solution to 
the challenges of applying DL to practical series arc fault detection, as 
discussed in Section 4.6.1, Section 4.6.2, and Section 4.6.4. 
In LTCNN-ADA, generalised fault features are extracted from the raw CT 
signals through a lightweight transfer network using source-domain knowledge and a small 
amount of target-domain fault data. ADA is performed to facilitate the process of 
knowledge transfer. Offline experiments are carried out using 4 different datasets with 
different power sources and inverters. Furthermore, real-time validation experiments are 
designed and performed based on the applicable cases recommended in the UL-1699B 
Standard (2018). Results from the comprehensive analysis demonstrate the 
effectiveness of the proposed framework as well as its generalisation on a small fault 
dataset. 
 
Some of the work described in this chapter has been submitted to a peer-reviewed 
journal: 
Shibo Lu, Rui Ma, Tharmakulasingam Sirojan, B.T. Phung, and Daming Zhang, 
“Lightweight Transfer Nets and Adversarial Data Augmentation for Photovoltaic Series 
Arc Fault Detection with Limited Fault Data”, Submitted to Solar Energy, 2020. 
 
 
 
7. Conclusion and Future Work 
7.1. Conclusion 
The main objective of this research is to develop practical and intelligent DC series 
arc fault detection for PV system protection. The findings are summarised as follows. 
Firstly, this research presents an in-depth literature review covering various aspects 
of DC arc fault detection in modern solar PV applications, including arc fault 
mechanisms and types, arc fault modelling, and state-of-the-art techniques for arc fault 
detection. The capabilities and limitations of different methods are presented and 
discussed. Useful information about each method, such as key methodology, 
sampling frequency, detection time, and accuracy, is summarised and compared. It is 
found that DL has not yet been investigated in this field, which presents 
a research gap to be filled. 
Then, a characteristics study on DC series arc faults is performed, focusing on the 
high-frequency variation in the arc current. The arc current and its spectrum depend 
on the source voltage, load current, and gap distance. The arc fault tends to 
produce less arcing noise when the stable operating point is further away from the 
interrupted point: the high-frequency variation induced by the arc tends to decrease with 
increasing source voltage, increasing load current, and smaller gap distance. Therefore, 
the worst-case scenario can be found accordingly, which can also be used to determine 
the minimum threshold value for conventional detection methods. The UL-1699B 
Standard also takes these factors into consideration and introduces some modifications 
to the arcing tests compared with the UL-1699B Outline. For example, the gap distance range 
is reduced from 1.6-6.4 mm to 0.8-2.5 mm. Furthermore, a method combining WPD and 
entropy theory is developed to analyse arc current signals in the time-frequency domain. It 
demonstrates the capability of extracting consistent patterns of series arc faults under 
different conditions. 
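The combination of WPD and entropy summarised above can be illustrated with a minimal 
Python sketch. It assumes a Haar wavelet, a 3-level decomposition, and Shannon entropy 
over the relative energy of the terminal nodes. These choices and all function names are 
illustrative assumptions, not the settings used in this thesis. 

```python
import math


def haar_step(x):
    """One Haar analysis step: returns (approximation, detail) at half length.

    A trailing odd sample, if any, is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_nodes(x, level):
    """Full wavelet packet tree: every node is split, not only the approximation."""
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes  # 2**level terminal nodes


def wavelet_packet_entropy(x, level=3):
    """Shannon entropy (bits) of the relative energy across terminal nodes."""
    energies = [sum(c * c for c in node) for node in wavelet_packet_nodes(x, level)]
    total = sum(energies)
    if total == 0.0:
        return 0.0
    probs = [e / total for e in energies if e > 0.0]
    return -sum(p * math.log2(p) for p in probs)


if __name__ == "__main__":
    import random

    random.seed(0)
    steady = [1.0] * 1024                                   # smooth DC-like signal
    noisy = [1.0 + random.gauss(0.0, 0.2) for _ in range(1024)]  # DC plus broadband noise
    print("steady:", wavelet_packet_entropy(steady))
    print("noisy: ", wavelet_packet_entropy(noisy))
```

In this sketch a steady signal concentrates its energy in one node and gives an entropy 
near zero, while broadband noise spreads energy across nodes and raises the entropy. 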
In Chapter 4, several barriers preventing ML, especially DL, from being implemented 
in practical applications are identified: imbalanced datasets, small datasets, inconsistency 
between training and testing datasets, unlabelled datasets, model complexity and real-time 
capability, and interpretability. These aspects have not been extensively explored in 
current research and need further development. Potential solutions to these challenges 
have been suggested to facilitate the application of intelligent series arc fault detection 
in industry. 
An intelligent DL-based series arc fault detection method using a CNN is proposed. A 
lightweight CNN structure is then designed through hyperparameter tuning. The overall 
number of parameters is less than 1% of that of some state-of-the-art deep CNN models, 
while the detection performance remains superior. This significantly reduces the 
computation resources required during inference with the trained DL model, which makes 
cost-effective real-time deployment on edge devices more feasible. 
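To make the parameter-count comparison concrete, the following Python sketch shows how 
the trainable parameters of 1D convolutional and dense layers are counted. The layer 
sizes and the reference model size are hypothetical placeholders chosen for illustration 
only; they are not the architecture or the baseline models used in this thesis. 

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of a 1D convolution: weights plus optional biases."""
    return in_channels * out_channels * kernel_size + (out_channels if bias else 0)


def dense_params(in_features, out_features, bias=True):
    """Trainable parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


def count_parameters(layers):
    """Sum the parameter counts of a list of (name, count) pairs."""
    return sum(count for _, count in layers)


def relative_size(model_params, reference_params):
    """Model size as a percentage of a reference model size."""
    return 100.0 * model_params / reference_params


# Hypothetical lightweight 1D CNN (illustrative sizes only).
HYPOTHETICAL_LAYERS = [
    ("conv1", conv1d_params(1, 8, 7)),    # 1*8*7 + 8
    ("conv2", conv1d_params(8, 16, 5)),   # 8*16*5 + 16
    ("conv3", conv1d_params(16, 32, 3)),  # 16*32*3 + 32
    ("fc", dense_params(32, 2)),          # 32*2 + 2 (arc / no arc)
]

if __name__ == "__main__":
    total = count_parameters(HYPOTHETICAL_LAYERS)
    # 1_000_000 is a placeholder reference size, not a real model.
    print(total, "parameters,", relative_size(total, 1_000_000), "% of the reference")
```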
A comparative study of popular ML classifiers is carried out using the 
same datasets to examine their effectiveness in PV series arc fault detection. The first 
dataset is prepared using raw CT time-series data without any hand-crafted features. The 
second dataset is formulated using wavelet packet entropy, with 2D feature maps 
extracted from the raw CT signal. 5 conventional ML methods, including shallow MLP, 
Gaussian NB, SVM, and RF, and 5 DL methods, including deep MLP, CNN, SAE, 
LSTM, and Bi-LSTM, are evaluated. It is found that conventional ML methods can 
benefit greatly from manual feature extraction. Furthermore, DL models with raw data 
as input can achieve best-in-class overall accuracy. Of all ML methods tested, CNN 
achieves the best overall classification accuracy regardless of whether feature 
extraction is applied. 
In addition, two novel frameworks, DA-DCGAN and LTCNN-ADA, are 
introduced in this research to optimise the performance of the proposed lightweight 
CNN-based series arc fault detection when sufficient target-domain fault data are lacking. 
They also address the performance degradation of DL algorithms caused by data 
inconsistency between the source-domain data used during development and the 
target-domain data encountered in operation in the field. DA-DCGAN first learns 
an intelligent normal-to-arcing transformation from the source-domain data using a 
DCGAN. Target-domain dummy arcing data can then be generated from target-domain 
normal data by the generator of the DCGAN (the transformer) with the learnt transformation. 
By including these fake arcing data in the training dataset, the target-domain arcing 
detection accuracy is dramatically improved from 76.56% to 92.12% in the case 
study in Chapter 5. Then, domain adaptation is performed by including an MMD 
penalty term during the training of the lightweight CNN, which minimises the 
distribution discrepancy between the target-domain and source-domain features. The 
target-domain arcing detection accuracy of the lightweight CNN is further enhanced by 
approximately 5.5% through learning more domain-invariant (shared) features. Hence, a 
robust and reliable fault diagnosis scheme is achieved for the target domain without 
using target-domain fault data. 
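The role of the MMD term can be sketched as follows. This minimal Python example 
computes a biased empirical estimate of the squared MMD between two sets of feature 
vectors and adds it to a classification loss. The Gaussian kernel, the bandwidth sigma, 
the weight lam, and the function names are assumptions made for illustration; the 
kernel and weighting used in this thesis may differ. 

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased empirical estimate of the squared MMD between two feature sets."""
    n, m = len(source), len(target)
    k_ss = sum(rbf_kernel(a, b, sigma) for a in source for b in source) / (n * n)
    k_tt = sum(rbf_kernel(a, b, sigma) for a in target for b in target) / (m * m)
    k_st = sum(rbf_kernel(a, b, sigma) for a in source for b in target) / (n * m)
    return k_ss + k_tt - 2.0 * k_st


def total_loss(classification_loss, source_feats, target_feats, lam=1.0, sigma=1.0):
    """Classification loss plus a weighted MMD penalty on the extracted features."""
    return classification_loss + lam * mmd_squared(source_feats, target_feats, sigma)


if __name__ == "__main__":
    source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    aligned = [[0.1, 0.0], [0.9, 0.1], [0.0, 1.1]]   # close to the source features
    shifted = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]   # far from the source features
    print("aligned:", mmd_squared(source, aligned))
    print("shifted:", mmd_squared(source, shifted))
```

Features that overlap between domains give a penalty near zero, whereas a distribution 
shift produces a larger penalty, so minimising the total loss pushes the network towards 
features shared by both domains. 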
LTCNN-ADA, proposed in Chapter 6, aims to enhance the detection 
performance of the proposed lightweight CNN with a limited amount of target-domain 
fault data using transfer learning and ADA. Firstly, ADA using WGAN-GP is 
performed to enlarge the target-domain fault dataset, making the target-domain dataset 
balanced. Then, the lightweight CNN is pre-trained on the source-domain dataset, which 
leverages the source-domain knowledge to obtain a better weight initialisation. Next, 
all the parameters of the pre-trained lightweight CNN are fine-tuned using the 
augmented target-domain dataset. Based on comprehensive offline analysis, the 
lightweight CNN with transfer learning dramatically improves on the performance of the 
original CNN. ADA further increases the performance, and the 
increase is more noticeable for smaller fault data sizes. For example, when only 20 
target-domain fault samples are used, improvements of more than 30% in target-domain 
arcing detection accuracy can be obtained in different transfer tasks. Comparative 
studies with other recent work based on deep transfer networks have been carried out. 
Results show that the proposed method can achieve better series arc fault detection 
accuracy in different transfer tasks with significantly reduced computational complexity 
during model inference. The knowledge transfer capability of a CNN with a lightweight 
structure is confirmed. 
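For reference, the standard WGAN-GP training objectives are restated below in generic 
notation. The symbols are assumptions introduced here and are not taken from Chapter 6: 
$D$ is the critic, $G$ the generator, $P_r$ the distribution of real target-domain fault 
samples, $P_g$ the distribution of generated samples, and $\lambda$ the gradient penalty 
weight. 

```latex
% Critic loss: Wasserstein estimate plus gradient penalty
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
    - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]

% Interpolated samples used for the penalty
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \qquad \epsilon \sim U[0, 1]

% Generator loss
L_G = -\mathbb{E}_{z \sim p(z)}\big[D(G(z))\big]
```

The gradient penalty keeps the critic approximately 1-Lipschitz, which is the property 
that stabilises training when only a small number of real fault samples is available. 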
Under the guidance of the UL-1699B Standard (2018), with applicable cases, multiple 
real-time experiments on different types of PV systems are carried out to validate the 
effectiveness of DL-based intelligent series arc fault detection. Experimental results 
demonstrate that the proposed methods enable fast and accurate detection of different types 
of series arc faults at different fault locations. Robustness against unwanted tripping is 
also confirmed through a series of tests covering different transients during normal 
operating conditions, such as switching on the DC disconnect, inverter start-up and 
initialisation, MPPT operation, and partial shading. 
7.2. Suggestions for Future Work 
Although a variety of techniques are available to detect DC series arc faults in 
PV systems, little research has been done on their localisation. Furthermore, the 
UL-1699B Standard (2018) has no clear requirements regarding localisation of DC 
series arc faults. In fact, effective and accurate localisation of DC series arc faults 
allows for a rapid response to determine the root cause of failure and to replace faulty 
components. Therefore, in addition to the development of detection techniques, the 
development of reliable methods for locating DC series arc faults is crucial, especially 
for large-scale PV systems. DL can be used for further research on the localisation of 
DC series arc faults. 
With the development of advanced power electronics and control technologies, DC 
systems are capable of realising much more complex functions and are thus 
becoming increasingly popular for applications such as smart homes, 
energy storage systems, and electric vehicle charging stations. Similar to PV systems, 
other DC systems also suffer from arcing faults. Up to the completion of this 
thesis, DL has not yet been investigated in these systems in the context of DC arc fault 
detection. Therefore, further investigation on this point is also of interest. In the 
future, the proposed intelligent DC series arc fault detection may be adapted to 
different applications. 
This research has solved many practical challenges that prevent intelligent DC 
series arc fault detection from being applied in industry, including imbalanced datasets, 
small datasets, data inconsistency, and model complexity for real-time deployment. 
However, some problems remain, such as unlabelled datasets and model 
interpretability, which need further investigation. Also, implementing DL-based 
algorithms in FPGA or ASIC can be done in the future for further cost reduction and 
reliability improvement. 
This research presents two novel frameworks for cross-domain DC series arc fault 
detection in PV systems. They are achieved by leveraging the knowledge of a single 
source domain. For example, for LTCNN-ADA, accurate series arc fault detection 
in the target-domain PV system is achieved by utilising transfer learning from a single 
source domain and a small number of labelled target-domain fault samples. Future research 
can focus on transfer learning across multiple source domains. In addition, 
transfer learning using unlabelled data from the target domain can also be explored. 

List of Acronyms 
AFCI Arc Fault Circuit Interrupter 
AFD Arc Fault Detector 
ANN Artificial Neural Network 
ASIC Application Specific Integrated Circuit 
Bi-LSTM Bidirectional Long Short-Term Memory 
BN Batch Normalisation 
BPNN Backpropagation Neural Network 
CNN Convolutional Neural Network 
CT Current Transformer 
DA-DCGAN Domain Adaptation and Deep Convolutional Generative Adversarial Network 
DAQ Data Acquisition 
DCGAN Deep Convolutional Generative Adversarial Network 
DL Deep Learning 
DT Decision Tree 
DWT Discrete Wavelet Transform 
EMR Electromagnetic Radiation 
FFT Fast Fourier Transform 
FL Fuzzy Logic 
FPGA Field-Programmable Gate Array 
GAN Generative Adversarial Network 
HMM Hidden Markov Model 
kNN k-Nearest Neighbours 
LR Logistic Regression 
LSTM Long Short-Term Memory 
LTCNN-ADA Lightweight Transfer Convolutional Neural Network with Adversarial Data Augmentation 
ML Machine Learning 
MLP Multilayer Perceptron 
MMD Maximum Mean Discrepancy 
MPPT Maximum Power Point Tracking 
NB Naïve Bayes 
PCA Principal Component Analysis 
PV Photovoltaic 
ReLU Rectified Linear Unit 
RF Random Forest 
RNN Recurrent Neural Network 
SAE Stacked Autoencoder 
SSTDR Spread Spectrum Time Domain Reflectometry 
STC Standard Test Conditions 
STFT Short-Time Fourier Transform 
SVM Support Vector Machine 
t-SNE t-distributed Stochastic Neighbour Embedding 
WGAN-GP Wasserstein Generative Adversarial Network with Gradient Penalty 
WPD Wavelet Packet Decomposition 
 
 
189 
Reference 
[1] Australian Photovoltaic Institute, “Australian PV market since April 2001.” 
https://pv-map.apvi.org.au/analyses (accessed May 31, 2020). 
[2] D. Smith, “Arc Flash Hazards on Photovoltaic Arrays,” Fort Collins, Colorado, 
USA, 2013. 
[3] M. Earley and J. Sargent, “National Electrical Code 2011 Handbook,” Natl. Fire 
Prot. Assoc. Quincy, MA, 2010. 
[4] K. Klement, “DC Arc Flash Studied for Solar Photovoltaic Systems: Challenges 
and Recommendations,” IEEE Trans. Ind. Appl., vol. 51, no. 5, pp. 4239–4244, 
2015, doi: 10.1109/ESW.2015.7094947. 
[5] M. Sedighizadeh, A. Rezazadeh, and N. I. Elkalashy, “Approaches in High 
Impedance Fault Detection - A Chronological Review,” Adv. Electr. Comput. 
Eng., vol. 10, no. 3, pp. 114–128, Aug. 2010, doi: 10.4316/aece.2010.03019. 
[6] A. Ghaderi, H. L. Ginn, and H. A. Mohammadpour, “High impedance fault 
detection: A review,” Electr. Power Syst. Res., vol. 143, pp. 376–388, Feb. 2017, 
doi: 10.1016/j.epsr.2016.10.021. 
[7] M. Mishra and R. R. Panigrahi, “Taxonomy of high impedance fault detection 
algorithm,” Measurement, vol. 148, p. 106955, Dec. 2019, doi: 
10.1016/j.measurement.2019.106955. 
[8] B. Hao, “AI in arcing-HIF detection: A brief review,” IET Smart Grid, vol. in 
press, Feb. 2020, doi: 10.1049/iet-stg.2019.0091. 
[9] L. Zhu, S. Ji, and Y. Liu, “Generation and developing process of low voltage 
series DC arc,” IEEE Trans. Plasma Sci., vol. 42, no. 10, pp. 2718–2719, Oct. 
2014, doi: 10.1109/TPS.2014.2330419. 
[10] T. Zgonena, L. Ji, and D. Dini, “Photovoltaic DC Arc-Fault Circuit Protection 
and UL Subject 1699B,” in Photovoltaic Module Reliability Workshop, 2011, 
Accessed: May 29, 2020. [Online]. Available: 
https://www.researchgate.net/publication/266054761. 
[11] AC Solar Wholesalers, “Solar Fires - DC Arc Faults.” 
https://www.acsolarwarehouse.com/news/solar-fires-dc-arc-faults-on-solar-
 
190 
systems/ (accessed Aug. 01, 2020). 
[12] B. Brook, “Report of the results of investigation of failure of the 1.1135 MW 
photovoltaic (PV) plant at the national gypsum facility in Mount Holly,” North 
Carolina, USA, 2011. 
[13] National Fire Protection Association, 70(R): National electrical code (R). Quincy, 
MA, USA: (NFPA) National Fire Protection Association, 2014. 
[14] UL 1699B - Outline of investigation for photovoltaic (PV) DC arc-fault circuit 
protection. Underwriters Laboratories, 2013. 
[15] UL 1699B - Standard for Photovoltaic (PV) DC Arc-Fault Circuit Protection. 
Underwriters Laboratories, 2018. 
[16] C. Strobl and P. Meckler, “Arc faults in photovoltaic systems,” in Annual Holm 
Conference on Electrical Contacts, 2010, pp. 216–222, doi: 
10.1109/HOLM.2010.5619538. 
[17] J. Yuventi, “DC electric arc-flash hazard-risk evaluations for photovoltaic 
systems,” IEEE Trans. Power Deliv., vol. 29, no. 1, pp. 161–167, 2014, doi: 
10.1109/TPWRD.2013.2289921. 
[18] S. Dhar, R. K. Patnaik, and P. K. Dash, “Fault Detection and Location of 
Photovoltaic Based DC Microgrid Using Differential Protection Strategy,” IEEE 
Trans. Smart Grid, vol. 9, no. 5, pp. 4303–4312, Sep. 2018, doi: 
10.1109/TSG.2017.2654267. 
[19] S. McCalmont, “Low Cost Arc Fault Detection and Protection for PV Systems,” 
Golden, CO (United States), Oct. 2013. doi: 10.2172/1110454. 
[20] X. Yao, J. Wang, and D. L. Schweickart, “Review and recent developments in 
DC arc fault detection,” in IEEE International Power Modulator and High 
Voltage Conference (IPMHVC), Aug. 2017, pp. 467–472, doi: 
10.1109/IPMHVC.2016.8012887. 
[21] M. K. Alam, F. Khan, J. Johnson, and J. Flicker, “A Comprehensive Review of 
Catastrophic Faults in PV Arrays: Types, Detection, and Mitigation Techniques,” 
IEEE J. Photovoltaics, vol. 5, no. 3, pp. 982–997, May 2015, doi: 
10.1109/JPHOTOV.2015.2397599. 
 
191 
[22] M. G. Villalva, J. R. Gazoli, and E. R. Filho, “Comprehensive approach to 
modeling and simulation of photovoltaic arrays,” IEEE Trans. Power Electron., 
vol. 24, no. 5, pp. 1198–1208, 2009, doi: 10.1109/TPEL.2009.2013862. 
[23] G. Petrone, C. A. Ramos-Paja, G. Spagnuolo, and Weidong Xiao, Photovoltaic 
sources modeling. Chichester, West Sussex, UK: John Wiley & Sons, 2017. 
[24] F. Reil, A. Sepanski, W. Herrmann, J. Althaus, W. Vaaßen, and H. Schmidt, 
“Qualification of arcing risks in PV modules,” in IEEE Photovoltaic Specialists 
Conference (PVSC), 2012, pp. 727–730, doi: 10.1109/PVSC.2012.6317709. 
[25] J. Flicker and J. Johnson, “Electrical simulations of series and parallel PV arc-
faults,” in IEEE Photovoltaic Specialists Conference (PVSC), 2013, pp. 3165–
3172, doi: 10.1109/PVSC.2013.6745127. 
[26] Z. Wang and R. S. Balog, “Arc Fault and Flash Signal Analysis in DC 
Distribution Systems Using Wavelet Transformation,” IEEE Trans. Smart Grid, 
vol. 6, no. 4, pp. 1955–1963, Jul. 2015, doi: 10.1109/TSG.2015.2407868. 
[27] J. Strauch, M. Quintana, J. Granata, W. Bower, and S. Kuszmaul, “Solar module 
arc fault modeling at Sandia National Laboratories,” in 2011 NREL Module 
Reliability Workshop, 2010. 
[28] J. K. Hastings, M. A. Juds, C. J. Luebke, and B. Pahl, “A study of ignition time 
for materials exposed to DC arcing in PV systems,” in IEEE Photovoltaic 
Specialists Conference (PVSC), 2011, pp. 3724–3729, doi: 
10.1109/PVSC.2011.6185959. 
[29] Z. Wang, S. McConnell, R. S. Balog, and J. Johnson, “Arc fault signal detection - 
Fourier transformation vs. wavelet decomposition techniques using synthesized 
data,” in IEEE Photovoltaic Specialists Conference (PVSC), Oct. 2014, pp. 
3239–3244, doi: 10.1109/PVSC.2014.6925625. 
[30] K. M. Armijo, J. Johnson, M. Hibbs, and A. Fresquez, “Characterizing fire 
danger from low-power photovoltaic arc-faults,” in IEEE Photovoltaic 
Specialists Conference (PVSC), Oct. 2014, pp. 3384–3390, doi: 
10.1109/PVSC.2014.6925658. 
[31] F. Erhard, B. Schaller, and F. Berger, “Field test results of serial DC arc fault 
investigationson real photovoltaic systems,” in Universities Power Engineering 
 
192 
Conference (UPEC), Oct. 2014, doi: 10.1109/UPEC.2014.6934689. 
[32] S. C. Wang, C. J. Wu, and Y. J. Wang, “Detection of arc fault on low voltage 
power circuits in time and frequency domain approach,” Int. J. Circuits, Syst. 
Signal Process., vol. 6, no. 5, pp. 324–331, 2012. 
[33] Z. Meng, L. Wang, and Q. Sun, “The characteristics of DC arc faults current,” in 
European Conference on Power Electronics and Applications (EPE), 2013, doi: 
10.1109/EPE.2013.6631914. 
[34] J. Johnson and K. Armijo, “Parametric study of PV arc-fault generation methods 
and analysis of conducted DC spectrum,” in IEEE Photovoltaic Specialists 
Conference (PVSC), Oct. 2014, pp. 3543–3548, doi: 
10.1109/PVSC.2014.6924874. 
[35] J. Johnson, S. S. Kuszmaul, W. I. Bower, and D. A. Schoenwald, “Using PV 
module and line frequency response data to create robust arc fault detectors,” in 
European Photovoltaic Solar Energy Conference, 2011, pp. 3745–3750. 
[36] B. England, “An Investigation into Arc Detection and Fire Safety Aspects of 
Photovoltaic Installations,” Murdoch University, Perth, WA, Australia, 2012. 
[37] J. Johnson, K. M. Armijo, M. Avrutsky, D. Eizips, and S.Kondrashov, “Arc-
fault unwanted tripping survey with UL 1699B-listed products,” in IEEE 
Photovoltaic Specialists Conference (PVSC), Dec. 2015, doi: 
10.1109/PVSC.2015.7356427. 
[38] J. Johnson, C. Oberhauser, M. Montoya, A. Fresquez, S. Gonzalez, and A. Patel, 
“Crosstalk nuisance trip testing of photovoltaic DC arc-fault detectors,” in IEEE 
Photovoltaic Specialists Conference (PVSC), 2012, pp. 1383–1387, doi: 
10.1109/PVSC.2012.6317857. 
[39] S. Chen, X. Li, and J. Xiong, “Series Arc Fault Identification for Photovoltaic 
System Based on Time-Domain and Time-Frequency-Domain Analysis,” IEEE J. 
Photovoltaics, vol. 7, no. 4, pp. 1105–1114, Jul. 2017, doi: 
10.1109/JPHOTOV.2017.2694421. 
[40] J. Schmid, E. Kancsar, M. Drapalik, V. Schlosser, and G. Klinger, “A Study of 
the Current Disturbance Caused by Wind Induced Vibrations of Photovoltaic 
Modules,” in International Conference on Renewable Energies and Power 
 
193 
Quality (ICREPQ), 2010, pp. 135–140, doi: 10.24084/repqj08.256. 
[41] A. D. Stokes, “Electric arcs in open air,” J. Phys. D. Appl. Phys., vol. 24, pp. 26–
35, 1991. 
[42] J. J. Lowke, “Simple theory of free burning arc,” J. Phys. D Appl., vol. 12, pp. 
1873–1886, 1979. 
[43] R. F. Ammerman, T. Gammon, P. K. Sen, and J. P. Nelson, “DC-arc models and 
incident-energy calculations,” IEEE Trans. Ind. Appl., vol. 46, no. 5, pp. 1810–
1819, Sep. 2010, doi: 10.1109/TIA.2010.2057497. 
[44] H. B. Estes, “Horizontal series fault comparison in AC & DC micro-grid 
architectures,” University of Texas at Austin, Austin, TX, USA, 2011. 
[45] N. Gustavsson, “Evaluation and Simulation of Black-box Arc Models for High 
Voltage Circuit-breakers,” Linköping University, Linköping, Sweden, 2004. 
[46] M. Ju and L. Wang, “Arc fault modeling and simulation in DC system based on 
Habedank model,” in Prognostics and System Health Management Conference 
(PHM), Jan. 2017, doi: 10.1109/PHM.2016.7819827. 
[47] L. van der Sluis, W. R. Rutgers, and C. G. A. Koreman, “A physical ARC model 
for the simulation of current zero behavior of high-voltage circuit breakers,” 
IEEE Trans. Power Deliv., vol. 7, no. 2, pp. 1016–1022, 1992, doi: 
10.1109/61.127112. 
[48] W. B. Nottingham, “A new equation for the static characteristic of the normal 
electric arc,” J. Am. Inst. Electr. Eng., vol. 42, no. 1, pp. 12–19, Oct. 2013, doi: 
10.1109/joaiee.1923.6591851. 
[49] D. B. Miller and J. L. Hildenbrand, “DC Arc Model including Circuit 
Constraints,” IEEE Trans. Power Appar. Syst., vol. PAS-92, no. 6, pp. 1926–
1934, 1973, doi: 10.1109/TPAS.1973.293572. 
[50] NFPA, NFPA 70E: Standard for Electrical Safety in the Workplace. Quincy, MA, 
USA: National Fire Protection Association, 2015. 
[51] J. Paukert, “The arc voltage and arc resistance of LV fault arcs,” in international 
symposium on switching arc phenomena, 1993, pp. 49–51. 
[52] X. Yao, L. Herrera, S. Ji, K. Zou, and J. Wang, “Characteristic study and time-
 
194 
domain discrete-wavelet-transform based hybrid detection of series DC arc 
faults,” IEEE Trans. Power Electron., vol. 29, no. 6, pp. 3103–3115, Jun. 2014, 
doi: 10.1109/TPEL.2013.2273292. 
[53] F. M. Uriarte et al., “A DC arc model for series faults in low voltage microgrids,” 
IEEE Trans. Smart Grid, vol. 3, no. 4, pp. 2063–2070, 2012, doi: 
10.1109/TSG.2012.2201757. 
[54] M. Weerasekara, M. Vilathgamuwa, and Y. Mishra, “Modelling of DC arcs for 
photovoltaic system faults,” in IEEE Annual Southern Power Electronics 
Conference (SPEC), 2016, doi: 10.1109/SPEC.2016.7846061. 
[55] V. V. Terzija, M. Popov, V. Stanojevic, and Z. Radojevic, “EMTP simulation 
and spectral domain features of a long arc in free air,” in International 
Conference and Exhibition on Electricity Distribution (CIRED), 2005, pp. 389–
392, doi: 10.1049/cp:20050953. 
[56] X. Yao, L. Herrera, and J. Wang, “Impact evaluation of series dc arc faults in dc 
microgrids,” in IEEE Applied Power Electronics Conference and Exposition 
(APEC), May 2015, pp. 2953–2958, doi: 10.1109/APEC.2015.7104771. 
[57] J. Johnson et al., “Photovoltaic DC arc fault detector testing at Sandia National 
Laboratories,” in IEEE Photovoltaic Specialists Conference (PVSC), 2011, pp. 
3614–3619, doi: 10.1109/PVSC.2011.6185930. 
[58] N. L. Georgijevic, M. V. Jankovic, S. Srdic, and Z. Radakovic, “The detection of 
series arc fault in photovoltaic systems based on the arc current entropy,” IEEE 
Trans. Power Electron., vol. 31, no. 8, pp. 5917–5930, Aug. 2016, doi: 
10.1109/TPEL.2015.2489759. 
[59] J. C. Kim, B. Lehman, and R. Ball, “DC Arc Fault Model Superimposing 
Multiple Random Arc Noise States on an Average Model,” in IEEE Workshop on 
Control and Modeling for Power Electronics (COMPEL), Jun. 2019, doi: 
10.1109/COMPEL.2019.8769649. 
[60] X. Shen, “On-line monitoring of arcing faults in medium voltage network,”
University of New South Wales, Sydney, NSW, Australia, 2016. 
[61] B. Novak, “Implementing arc detection in solar applications: achieving 
compliance with the new UL 1699B Standard,” 2012.
[62] J. Johnson and J. Kang, “Arc-fault detector algorithm evaluation method utilizing 
prerecorded arcing signatures,” in IEEE Photovoltaic Specialists Conference 
(PVSC), 2012, pp. 1378–1382, doi: 10.1109/PVSC.2012.6317856. 
[63] Y. Ohta and H. Isoda, “Arc detecting device and aircraft equipped therewith,” 
U.S. Patent 8,093,904, 2012. 
[64] Y. Cao, J. Li, M. Sumner, E. Christopher, and D. Thomas, “Arc fault generation 
and detection in DC systems,” in Asia-Pacific Power and Energy Engineering 
Conference (APPEEC), 2013, doi: 10.1109/APPEEC.2013.6837123. 
[65] F. Eger, “Arcing signals in PV power plants - Evaluation of AFD test standards 
and alternative setups based on in-field arc measurements,” in IEEE Photovoltaic 
Specialists Conference (PVSC), Nov. 2016, pp. 1250–1253, doi: 
10.1109/PVSC.2016.7749814. 
[66] J. Johnson, D. Schoenwald, S. Kuszmaul, J. Strauch, and W. Bower, “Creating 
dynamic equivalent PV circuit models with impedance spectroscopy for arc fault 
modeling,” in IEEE Photovoltaic Specialists Conference (PVSC), 2011, pp. 
2328–2333, doi: 10.1109/PVSC.2011.6186419. 
[67] M. I. Fitrianto, E. Wahjono, D. O. Anggriawan, E. Prasetyono, R. H. Mubarok, 
and A. Tjahjono, “Identification and Protection of Series DC Arc Fault for 
Photovoltaic Systems Based on Fast Fourier Transform,” in International 
Electronics Symposium (IES), Sep. 2019, pp. 159–163, doi: 
10.1109/ELECSYM.2019.8901605. 
[68] K. Xia, Z. He, Y. Yuan, Y. Wang, and P. Xu, “An arc fault detection system for 
the household photovoltaic inverter according to the DC bus currents,” in 
International Conference on Electrical Machines and Systems (ICEMS), Jan. 
2016, pp. 1687–1690, doi: 10.1109/ICEMS.2015.7385312. 
[69] J. C. Gu, D. S. Lai, J. M. Wang, J. J. Huang, and M. T. Yang, “Design of a DC 
series arc fault detector for photovoltaic system protection,” IEEE Trans. Ind. 
Appl., vol. 55, no. 3, pp. 2464–2471, May 2019, doi: 10.1109/TIA.2019.2894992. 
[70] G. S. Seo, H. Bae, B. H. Cho, and K. C. Lee, “Arc protection scheme for DC 
distribution systems with photovoltaic generation,” in International Conference 
on Renewable Energy Research and Applications (ICRERA), 2012, doi: 
10.1109/ICRERA.2012.6477374. 
[71] C. Aarstad, T. Taufik, A. Kean, and M. Muscarella, “Development of arc fault 
interrupter laboratory testing for low voltage DC electricity,” in International 
Seminar on Intelligent Technology and Its Application (ISITIA), Jan. 2016, pp. 
583–588, doi: 10.1109/ISITIA.2016.7828725. 
[72] G. S. Seo, K. A. Kim, K. C. Lee, K. J. Lee, and B. H. Cho, “A new DC arc fault 
detection method using DC system component modeling and analysis in low 
frequency range,” in IEEE Applied Power Electronics Conference and 
Exposition (APEC), May 2015, pp. 2438–2444, doi: 
10.1109/APEC.2015.7104690. 
[73] M. Kanemaru, K. Kokura, M. Mori, T. Shindoi, and M. Yamamoto, “Identification technique of DC series arc-fault strings in photovoltaic systems,”
Electr. Eng. Japan (English Transl. Denki Gakkai Ronbunshi), vol. 207, no. 2, pp. 
12–19, 2019, doi: 10.1002/eej.23204. 
[74] S. Chae, J. Park, and S. Oh, “Series DC Arc Fault Detection Algorithm for DC 
Microgrids Using Relative Magnitude Comparison,” IEEE J. Emerg. Sel. Top. 
Power Electron., vol. 4, no. 4, pp. 1270–1278, Dec. 2016, doi: 
10.1109/JESTPE.2016.2592186. 
[75] W. Miao, X. Liu, K. H. Lam, and P. W. T. Pong, “Arc-Faults Detection in PV 
Systems by Measuring Pink Noise with Magnetic Sensors,” IEEE Trans. Magn., 
vol. 55, no. 7, Jul. 2019, doi: 10.1109/TMAG.2019.2903899. 
[76] W. Miao, X. Liu, K. H. Lam, and P. W. T. Pong, “DC-arcing detection by noise 
measurement with magnetic sensing by TMR sensors,” IEEE Trans. Magn., vol. 
54, no. 11, 2018, doi: 10.1109/TMAG.2018.2842187. 
[77] M. Wendl, M. Weiss, and F. Berger, “HF Characterization of Low Current DC 
Arcs at Alterable Conditions,” in International Conference on Electrical 
Contacts (ICEC), 2014, pp. 439–444. 
[78] J. Johnson et al., “Differentiating series and parallel photovoltaic arc-faults,” in 
IEEE Photovoltaic Specialists Conference (PVSC), 2012, pp. 720–726, doi: 
10.1109/PVSC.2012.6317708. 
[79] L. Yuan, J. Shengchang, W. Jin, Y. Xiu, and Z. Yeye, “Study on characteristics 
and detection of DC arc fault in power electronics system,” in IEEE International 
Conference on Condition Monitoring and Diagnosis (CMD), 2012, pp. 1043–
1046, doi: 10.1109/CMD.2012.6416335. 
[80] T. A. Ramamurthy and K. S. Swarup, “High Impedance Fault detection using 
DWT for transmission and distribution networks,” in IEEE International 
Conference on Power Systems (ICPS), Oct. 2016, doi: 
10.1109/ICPES.2016.7584004. 
[81] S. Chen and X. Li, “PV series arc fault recognition under different working 
conditions with joint detection method,” in Annual Holm Conference on 
Electrical Contacts, Dec. 2016, pp. 25–32, doi: 10.1109/HOLM.2016.7780002. 
[82] S. V. Narasimhan, N. Basumallick, and S. Veena, Introduction to wavelet 
transform: a signal processing approach. Alpha Science International, Ltd, 2011.
[83] G. Yunmei, W. Li, W. Zhuoqi, and J. Binfeng, “Wavelet packet analysis applied 
in detection of low-voltage DC arc fault,” in IEEE Conference on Industrial 
Electronics and Applications (ICIEA), 2009, pp. 4013–4016, doi: 
10.1109/ICIEA.2009.5138962. 
[84] Z. Wang and R. S. Balog, “Arc fault and flash detection in DC photovoltaic 
arrays using wavelets,” in IEEE Photovoltaic Specialists Conference (PVSC), 
2013, pp. 1619–1624, doi: 10.1109/PVSC.2013.6744455. 
[85] X. Yao, L. Herrera, and J. Wang, “A series DC arc fault detection method and 
hardware implementation,” in IEEE Applied Power Electronics Conference and 
Exposition (APEC), 2013, pp. 2444–2449, doi: 10.1109/APEC.2013.6520638. 
[86] W. Li, A. Monti, and F. Ponci, “Fault detection and classification in medium 
voltage dc shipboard power systems with wavelets and artificial neural networks,” 
IEEE Trans. Instrum. Meas., vol. 63, no. 11, pp. 2651–2665, Nov. 2014, doi: 
10.1109/TIM.2014.2313035. 
[87] R. S. Balog, “Method and system for detecting arc faults and flashes using 
wavelets,” U.S. Patent 9,329,220, 2016. 
[88] Y. Zhang, L. Wang, and S. Yang, “Research on Characteristics of DC Arc Fault 
Based on Wavelet Transform,” in IEEE International Conference on Electrical 
Systems for Aircraft, Railway, Ship Propulsion and Road Vehicles and 
International Transportation Electrification Conference (ESARS-ITEC), Jan. 
2019, doi: 10.1109/ESARS-ITEC.2018.8607619. 
[89] H. Zhu, Z. Wang, and R. S. Balog, “Real time arc fault detection in PV systems 
using wavelet decomposition,” in IEEE Photovoltaic Specialists Conference 
(PVSC), Nov. 2016, pp. 1761–1766, doi: 10.1109/PVSC.2016.7749926. 
[90] C. He, L. Mu, and Y. Wang, “The Detection of Parallel Arc Fault in Photovoltaic 
Systems Based on a Mixed Criterion,” IEEE J. Photovoltaics, vol. 7, no. 6, pp. 
1717–1724, Nov. 2017, doi: 10.1109/JPHOTOV.2017.2742143. 
[91] S. Chen, X. Li, Y. Meng, and Z. Xie, “Wavelet-based protection strategy for 
series arc faults interfered by multicomponent noise signals in grid-connected 
photovoltaic systems,” Sol. Energy, vol. 183, pp. 327–336, 2019, doi: 
10.1016/j.solener.2019.03.008. 
[92] M. Dargatz and M. Fornage, “Method and apparatus for detection and control of 
dc arc faults,” U.S. Patent 8,179,147, 2012. 
[93] D. Kilrey and W. Oldenburg, “DC arc fault detection and protection,” U.S. Patent 
0133135A1, 2007. 
[94] H. Braun et al., “Signal processing for fault detection in photovoltaic arrays,” in 
IEEE International Conference on Acoustics, Speech and Signal Processing 
(ICASSP), 2012, pp. 1681–1684, doi: 10.1109/ICASSP.2012.6288220. 
[95] S. Buddha et al., “Signal processing for photovoltaic applications,” in IEEE 
International Conference on Emerging Signal Processing Applications (ESPA), 
2012, pp. 115–118, doi: 10.1109/ESPA.2012.6152459. 
[96] F. Schimpf and L. E. Narum, “Recognition of electric arcing in the DC-wiring of 
photovoltaic systems,” in International Telecommunications Energy Conference 
(INTELEC), 2009, doi: 10.1109/INTLEC.2009.5352037. 
[97] Y. Gao, J. Zhang, Y. Lin, and Y. Sun, “An innovative photovoltaic DC arc fault 
detection method through multiple criteria algorithm based on a new arc 
initiation method,” in IEEE Photovoltaic Specialists Conference (PVSC), Oct. 
2014, pp. 3188–3192, doi: 10.1109/PVSC.2014.6925613. 
[98] Q. Lu, Z. Ye, M. Su, Y. Li, Y. Sun, and H. Huang, “A DC series arc fault 
detection method using line current and supply voltage,” IEEE Access, vol. 8, pp. 
10134–10146, 2020, doi: 10.1109/ACCESS.2019.2963500. 
[99] C. Strobl, “Arc fault detection in DC microgrids,” in IEEE 1st International 
Conference on Direct Current Microgrids (ICDCM), Jul. 2015, pp. 181–186, doi: 
10.1109/ICDCM.2015.7152035. 
[100] C. Strobl, “Arc fault detection – a model-based approach,” in International
Conference on Electrical Contacts (ICEC), 2014, pp. 367–373.
[101] A. R. Jordehi, “Parameter estimation of solar photovoltaic (PV) cells: A review,”
Renew. Sustain. Energy Rev., vol. 61, pp. 354–368, 2016. 
[102] D. T. Cotfas, P. A. Cotfas, and S. Kaplanis, “Methods and techniques to 
determine the dynamic parameters of solar cells: Review,” Renew. Sustain. 
Energy Rev., vol. 61, pp. 213–221, 2016, doi: 10.1016/j.rser.2016.03.051. 
[103] R. D. Telford, S. Galloway, B. Stephen, and I. Elders, “Diagnosis of series DC 
Arc faults - A machine learning approach,” IEEE Trans. Ind. Informatics, vol. 13, 
no. 4, pp. 1598–1609, Aug. 2017, doi: 10.1109/TII.2016.2633335. 
[104] J. A. Momoh and R. Button, “Design and Analysis of Aerospace DC Arcing 
Faults using Fast Fourier Transformation and Artificial Neural Network,” in 
IEEE Power Engineering Society General Meeting, 2003, pp. 788–793, doi: 
10.1109/pes.2003.1270407. 
[105] J. Yang and Y. Wang, “Identification and Detection of DC Arc Fault in 
Photovoltaic Power Generation System,” in International Conference on 
Intelligent Transportation, Big Data & Smart City (ICITBS), Jan. 2020, pp. 440–
444, doi: 10.1109/ICITBS49701.2020.00095. 
[106] Z. Wang and R. S. Balog, “Arc fault and flash detection in photovoltaic systems 
using wavelet transform and support vector machines,” in IEEE Photovoltaic 
Specialists Conference (PVSC), Nov. 2016, pp. 3275–3280, doi: 
10.1109/PVSC.2016.7750271. 
[107] K. Yang, R. Zhang, J. Yang, C. Liu, S. Chen, and F. Zhang, “A Novel Arc Fault 
Detector for Early Detection of Electrical Fires,” Sensors, vol. 16, no. 4, p. 500, 
Apr. 2016, doi: 10.3390/s16040500. 
[108] K. Xia, S. He, Y. Tan, Q. Jiang, J. Xu, and W. Yu, “Wavelet packet and support 
vector machine analysis of series DC ARC fault detection in photovoltaic system,” 
IEEJ Trans. Electr. Electron. Eng., vol. 14, no. 2, pp. 192–200, Feb. 2019, doi: 
10.1002/tee.22797. 
[109] K. Xia, H. Guo, S. He, W. Yu, J. Xu, and H. Dong, “Binary classification model 
based on machine learning algorithm for the DC serial arc detection in electric 
vehicle battery system,” IET Power Electron., vol. 12, no. 1, pp. 112–119, Jan. 
2019, doi: 10.1049/iet-pel.2018.5789. 
[110] K. Xia et al., “Wavelet entropy analysis and machine learning classification 
model of DC serial arc fault in electric vehicle power system,” IET Power 
Electron., vol. 12, no. 15, pp. 3998–4004, Dec. 2019, doi: 10.1049/iet-
pel.2019.0375. 
[111] Z. Yin, L. Wang, Y. Zhang, Y. Gao, and S. Yang, “The DC Arc Fault Detection 
Method Taken Advantage of WT and MFE,” in Prognostics and System Health 
Management Conference (PHM), Oct. 2019, doi: 10.1109/PHM-
Qingdao46334.2019.8942846. 
[112] B. Grichting, J. Goette, and M. Jacomet, “Cascaded fuzzy logic based arc fault 
detection in photovoltaic applications,” in International Conference on Clean 
Electrical Power: Renewable Energy Resources Impact (ICCEP), Aug. 2015, pp. 
178–183, doi: 10.1109/ICCEP.2015.7177620. 
[113] Y. Zhao, R. Ball, J. Mosesian, J.-F. de Palma, and B. Lehman, “Graph-based 
detection rules for fault detection in solar photovoltaic arrays,” IEEE Trans. 
Power Electron., vol. 30, no. 5, pp. 2848–2858, 2015. 
[114] Y. Zhao, L. Yang, B. Lehman, J. F. De Palma, J. Mosesian, and R. Lyons, 
“Decision tree-based fault detection and classification in solar photovoltaic 
arrays,” in IEEE Applied Power Electronics Conference and Exposition (APEC), 
2012, pp. 93–99, doi: 10.1109/APEC.2012.6165803. 
[115] V. Le, X. Yao, C. Miller, and B. H. Tsao, “Series DC Arc Fault Detection Based 
on Ensemble Machine Learning,” IEEE Trans. Power Electron., vol. 35, no. 8, 
pp. 7826–7839, Aug. 2020, doi: 10.1109/TPEL.2020.2969561. 
[116] V. Le, X. Yao, C. Miller, and T. B. Hung, “Arc fault detection in DC distribution 
using semi-supervised ensemble machine learning,” in IEEE Energy Conversion 
Congress and Exposition (ECCE), Sep. 2019, pp. 2939–2945, doi: 
10.1109/ECCE.2019.8913286. 
[117] M. Rabla, E. Tisserand, P. Schweitzer, and J. Lezama, “Arc fault analysis and 
localization by cross-correlation in 270 V DC,” in IEEE 59th Holm Conference 
on Electrical Contacts, 2013, pp. 1–6. 
[118] M. Ahmadi, H. Samet, and T. Ghanbari, “Series Arc Fault Detection in 
Photovoltaic Systems Based on Signal-to-Noise Ratio Characteristics Using 
Cross-Correlation Function,” IEEE Trans. Ind. Informatics, vol. 16, no. 5, pp. 
3198–3209, May 2020, doi: 10.1109/TII.2019.2909753. 
[119] M. Ahmadi, H. Samet, and T. Ghanbari, “Kalman filter–based approach for 
detection of series arc fault in photovoltaic systems,” Int. Trans. Electr. Energy 
Syst., vol. 29, no. 5, p. e2823, May 2019, doi: 10.1002/2050-7038.2823. 
[120] K. Gajula and L. Herrera, “Detection and Localization of Series Arc Faults in DC 
Microgrids using Kalman Filter,” IEEE J. Emerg. Sel. Top. Power Electron., pp. 
1–1, Apr. 2020, doi: 10.1109/jestpe.2020.2987491. 
[121] N. L. Georgijevic, D. Stojic, and Z. Radakovic, “Series arc fault detection in 
photovoltaic system by small-signal impedance and noise monitoring,” Int. Trans. 
Electr. Energy Syst., vol. 30, no. 2, pp. 1–15, 2020, doi: 10.1002/2050-
7038.12234. 
[122] M. K. Alam, F. H. Khan, J. Johnson, and J. Flicker, “PV arc-fault detection using 
spread spectrum time domain reflectometry (SSTDR),” in IEEE Energy 
Conversion Congress and Exposition (ECCE), Nov. 2014, pp. 3294–3300, doi: 
10.1109/ECCE.2014.6953848. 
[123] H. L. Lan and R. C. Zhang, “Current research and development trends on faults 
arc detection method in switch cabinet,” Gaodianya Jishu/High Volt. Eng., vol. 
34, no. 3, pp. 496–499, Mar. 2008. 
[124] Q. Xiong, S. Ji, L. Zhu, L. Zhong, and Y. Liu, “A novel DC arc fault detection 
method based on electromagnetic radiation signal,” IEEE Trans. Plasma Sci., vol. 
45, no. 3, pp. 472–478, Mar. 2017, doi: 10.1109/TPS.2017.2653817. 
[125] C. J. Kim, “Electromagnetic radiation behavior of low-voltage arcing fault,” 
IEEE Trans. Power Deliv., vol. 24, no. 1, pp. 416–423, 2009, doi: 
10.1109/TPWRD.2008.2002873.
[126] J. J. Shea, C. J. Luebke, and K. L. Parker, “RF current produced from DC 
electrical arcing,” in International Conference on Electrical Contacts (ICEC), 2012,
pp. 1–6, doi: 10.1049/cp.2012.0612.
[127] Q. Xiong et al., “Electromagnetic Radiation Characteristics of Series DC Arc 
Fault and Its Determining Factors,” IEEE Trans. Plasma Sci., vol. 46, no. 11, pp. 
4028–4036, Nov. 2018, doi: 10.1109/TPS.2018.2864605. 
[128] L. Zhu, J. Li, Y. Liu, and S. Ji, “Initial features of the unintended atmospheric 
pressure dc arcs and their application on the fault detection,” IEEE Trans. 
Plasma Sci., vol. 45, no. 4, pp. 742–748, Apr. 2017, doi: 
10.1109/TPS.2017.2676821. 
[129] G. S. Seo, J. I. Ha, B. H. Cho, and K. C. Lee, “Series arc fault detection method 
based on statistical analysis for dc Microgrids,” in IEEE Applied Power 
Electronics Conference and Exposition (APEC), May 2016, pp. 487–492, doi: 
10.1109/APEC.2016.7467916. 
[130] L. Ryves, D. R. McKenzie, and M. M. M. Bilek, “Cathode-spot dynamics in a 
high-current pulsed arc: A noise study,” IEEE Trans. Plasma Sci., vol. 37, no. 2, 
pp. 365–368, 2009, doi: 10.1109/TPS.2008.2007734. 
[131] S. Lu, B. T. Phung, and D. Zhang, “A comprehensive review on DC arc faults 
and their diagnosis methods in photovoltaic systems,” Renew. Sustain. Energy 
Rev., vol. 89, pp. 88–98, Jun. 2018, doi: 10.1016/j.rser.2018.03.010. 
[132] P. G. Slade, “The Arc and Interruption,” in Electrical Contacts: Principles and 
Applications, 2nd ed., CRC Press, 2013, pp. 554–616.
[133] A. I. Aio, “Modelization and analysis of the electric arc in low voltage circuit 
breakers,” University of The Basque Country, Spain, 2013. 
[134] D. Hanbay, I. Turkoglu, and Y. Demir, “Prediction of wastewater treatment plant 
performance based on wavelet packet decomposition and neural networks,” 
Expert Syst. Appl., vol. 34, no. 2, pp. 1038–1043, Feb. 2008, doi: 
10.1016/j.eswa.2006.10.030. 
[135] V. Kecman, Support Vector Machines: Theory and Applications. Springer 
Science & Business Media, 2005. 
[136] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.
[137] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, 
pp. 436–444, May 2015, doi: 10.1038/nature14539. 
[138] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network 
training by reducing internal covariate shift,” in International Conference on 
Machine Learning (ICML), Feb. 2015, pp. 448–456. 
[139] V. Nair and G. E. Hinton, “Rectified Linear Units Improve Restricted Boltzmann 
Machines,” in International Conference on Machine Learning (ICML), 2010, pp. 
807–814. 
[140] A. Graves and J. Schmidhuber, “Framewise phoneme classification with 
bidirectional LSTM and other neural network architectures,” Neural Networks, 
vol. 18, no. 5–6, pp. 602–610, Jul. 2005, doi: 10.1016/j.neunet.2005.06.042. 
[141] J. P. Ram, H. Manghani, D. S. Pillai, T. S. Babu, M. Miyatake, and N. Rajasekar, 
“Analysis on solar PV emulators: A review,” Renew. Sustain. Energy Rev., vol. 
81, pp. 149–160, Jan. 2018, doi: 10.1016/j.rser.2017.07.039. 
[142] X. Jiang, J. Sun, C. Li, and H. Ding, “Video Image Defogging Recognition 
Based on Recurrent Neural Network,” IEEE Trans. Ind. Informatics, vol. 14, no. 
7, pp. 3281–3288, Jul. 2018, doi: 10.1109/TII.2018.2810188. 
[143] Z. Liu, Z. Wu, T. Li, J. Li, and C. Shen, “GMM and CNN Hybrid Method for 
Short Utterance Speaker Recognition,” IEEE Trans. Ind. Informatics, vol. 14, no. 
7, pp. 3244–3252, Jul. 2018, doi: 10.1109/TII.2018.2799928. 
[144] B. Luo, H. Wang, H. Liu, B. Li, and F. Peng, “Early Fault Detection of Machine 
Tools Based on Deep Learning and Dynamic Identification,” IEEE Trans. Ind. 
Electron., vol. 66, no. 1, pp. 509–518, Jan. 2019, doi: 10.1109/TIE.2018.2807414.
[145] Z. Zeng, Z. Li, D. Cheng, H. Zhang, K. Zhan, and Y. Yang, “Two-Stream 
Multirate Recurrent Neural Network for Video-Based Pedestrian 
Reidentification,” IEEE Trans. Ind. Informatics, vol. 14, no. 7, pp. 3179–3186, 
Jul. 2018, doi: 10.1109/TII.2017.2767557. 
[146] C. Sun, M. Ma, Z. Zhao, and X. Chen, “Sparse Deep Stacking Network for Fault 
Diagnosis of Motor,” IEEE Trans. Ind. Informatics, vol. 14, no. 7, pp. 3261–
3270, Jul. 2018, doi: 10.1109/TII.2018.2819674. 
[147] M. F. Guo, X. D. Zeng, D. Y. Chen, and N. C. Yang, “Deep-Learning-Based 
Earth Fault Detection Using Continuous Wavelet Transform and Convolutional 
Neural Network in Resonant Grounding Distribution Systems,” IEEE Sens. J., 
vol. 18, no. 3, pp. 1291–1300, Feb. 2018, doi: 10.1109/JSEN.2017.2776238. 
[148] G. Jiang, H. He, J. Yan, and P. Xie, “Multiscale Convolutional Neural Networks 
for Fault Diagnosis of Wind Turbine Gearbox,” IEEE Trans. Ind. Electron., vol. 
66, no. 4, pp. 3196–3207, Apr. 2019, doi: 10.1109/TIE.2018.2844805. 
[149] M. Zhao, M. Kang, B. Tang, and M. Pecht, “Deep Residual Networks with 
Dynamically Weighted Wavelet Coefficients for Fault Diagnosis of Planetary 
Gearboxes,” IEEE Trans. Ind. Electron., vol. 65, no. 5, pp. 4290–4300, May 
2018, doi: 10.1109/TIE.2017.2762639. 
[150] J. Wang, R. Zhao, D. Wang, R. Yan, K. Mao, and F. Shen, “Machine health 
monitoring using local feature-based gated recurrent unit networks,” IEEE Trans. 
Ind. Electron., vol. 65, no. 2, pp. 1539–1548, Feb. 2018, doi: 
10.1109/TIE.2017.2733438. 
[151] S. Chen, X. Li, Z. Xie, and Y. Meng, “Time-Frequency Distribution 
Characteristic and Model Simulation of Photovoltaic Series Arc Fault With 
Power Electronic Equipment,” IEEE J. Photovoltaics, vol. 9, no. 4, pp. 1128–
1137, Jul. 2019, doi: 10.1109/JPHOTOV.2019.2915337. 
[152] M. Ahmadi, H. Samet, and T. Ghanbari, “Series Arc Fault Detection in 
Photovoltaic Systems Based on Signal-to-Noise Ratio Characteristics Using 
Cross-Correlation Function,” IEEE Trans. Ind. Informatics, vol. 16, no. 5, pp. 
3198–3209, May 2020, doi: 10.1109/TII.2019.2909753. 
[153] L. Wen, X. Li, L. Gao, and Y. Zhang, “A New Convolutional Neural Network-
Based Data-Driven Fault Diagnosis Method,” IEEE Trans. Ind. Electron., vol. 65, 
no. 7, pp. 5990–5998, Jul. 2018, doi: 10.1109/TIE.2017.2774777. 
[154] R. Liu, G. Meng, B. Yang, C. Sun, and X. Chen, “Dislocated Time Series 
Convolutional Neural Architecture: An Intelligent Fault Diagnosis Approach for 
Electric Machine,” IEEE Trans. Ind. Informatics, vol. 13, no. 3, pp. 1310–1320, 
Jun. 2017, doi: 10.1109/TII.2016.2645238. 
[155] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning 
applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2323, 
1998, doi: 10.1109/5.726791. 
[156] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with 
Deep Convolutional Neural Networks,” 2012. Accessed: Aug. 04, 2020. [Online]. 
Available: http://code.google.com/p/cuda-convnet/. 
[157] K. Simonyan and A. Zisserman, “Very Deep Convolutional Networks for Large-
Scale Image Recognition,” in International Conference on Learning 
Representations (ICLR), 2015, Accessed: Aug. 04, 2020. [Online]. Available: 
http://arxiv.org/abs/1409.1556. 
[158] A. Khan, A. Sohail, U. Zahoora, and A. S. Qureshi, “A survey of the recent 
architectures of deep convolutional neural networks,” Artif. Intell. Rev., pp. 1–62, 
Apr. 2020, doi: 10.1007/s10462-020-09825-6. 
[159] Y. Wang, F. Zhang, and S. Zhang, “A New Methodology for Identifying Arc 
Fault by Sparse Representation and Neural Network,” IEEE Trans. Instrum. 
Meas., vol. 67, no. 11, pp. 2526–2537, 2018, doi: 10.1109/TIM.2018.2826878. 
[160] S. Lu, H. Chai, A. Sahoo, and B. T. Phung, “Condition Monitoring based on 
Partial Discharge Diagnostics using Machine Learning Methods: A 
Comprehensive State-of-the-Art Review,” IEEE Trans. Dielectr. Electr. Insul., 
in press, 2020.
[161] H. Zheng et al., “Cross-Domain Fault Diagnosis Using Knowledge Transfer 
Strategy: A Review,” IEEE Access, vol. 7, pp. 129260–129290, 2019, doi: 
10.1109/ACCESS.2019.2939876. 
[162] S. Lu, B. T. Phung, and D. Zhang, “Study on DC series arc fault in photovoltaic 
systems for condition monitoring purpose,” in Australasian Universities Power 
Engineering Conference (AUPEC), Feb. 2018, pp. 1–6, doi: 
10.1109/AUPEC.2017.8282464. 
[163] J. Andrea, D. Jung, P. Schweitzer, B. Vidales, E. Calderon, and S. Weber, 
“Simulation of Arcing Fault in PV Panel Network,” in Electrical Contacts, 
Proceedings of the Annual Holm Conference on Electrical Contacts, Jan. 2019, 
pp. 329–335, doi: 10.1109/HOLM.2018.8611743. 
[164] E. Strubell, A. Ganesh, and A. McCallum, “Energy and Policy Considerations for 
Deep Learning in NLP,” in Annual Meeting of the Association for Computational 
Linguistics (ACL), Jun. 2019, pp. 3645–3650, doi: 10.18653/v1/p19-1355. 
[165] Q.-S. Zhang and S.-C. Zhu, “Visual interpretability for deep learning: a 
survey,” Front. Inf. Technol. Electron. Eng., vol. 19, no. 1, pp. 27–39, Jan. 2018, 
doi: 10.1631/FITEE.1700808. 
[166] G. Montavon, W. Samek, and K. R. Müller, “Methods for interpreting and 
understanding deep neural networks,” Digit. Signal Process., vol. 73, pp. 1–15, 
Feb. 2018, doi: 10.1016/j.dsp.2017.10.011. 
[167] T. Sirojan, S. Lu, B. T. Phung, D. Zhang, and E. Ambikairajah, “Sustainable 
Deep Learning at Grid Edge for Real-time High Impedance Fault Detection,” 
IEEE Trans. Sustain. Comput., Nov. 2018, doi: 10.1109/tsusc.2018.2879960. 
[168] T. De Bruin, K. Verbert, and R. Babuska, “Railway Track Circuit Fault 
Diagnosis Using Recurrent Neural Networks,” IEEE Trans. Neural Networks 
Learn. Syst., vol. 28, no. 3, pp. 523–533, Mar. 2017, doi: 
10.1109/TNNLS.2016.2551940. 
[169] F. Reil, A. Sepanski, S. Raubach, M. Vosen, and E. Dietrich, “Comparison of 
different DC Arc spectra - Derivation of proposals for the development of an 
international arc fault detector standard,” in IEEE Photovoltaic Specialists 
Conference (PVSC), 2013, pp. 1589–1593, doi: 10.1109/PVSC.2013.6744449. 
[170] E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell, “Adversarial discriminative 
domain adaptation,” in IEEE Conference on Computer Vision and Pattern 
Recognition (CVPR), Nov. 2017, pp. 2962–2971, doi: 10.1109/CVPR.2017.316. 
[171] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey on deep 
transfer learning,” in Lecture Notes in Computer Science (including subseries 
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Oct. 
2018, pp. 270–279, doi: 10.1007/978-3-030-01424-7_27. 
[172] V. M. Patel, R. Gopalan, R. Li, and R. Chellappa, “Visual Domain Adaptation: A 
survey of recent advances,” IEEE Signal Process. Mag., vol. 32, no. 3, pp. 53–69, 
May 2015, doi: 10.1109/MSP.2014.2347059. 
[173] P. Cao, S. Zhang, and J. Tang, “Preprocessing-Free Gear Fault Diagnosis Using 
Small Datasets with Deep Convolutional Neural Network-Based Transfer 
Learning,” IEEE Access, vol. 6, pp. 26241–26253, May 2018, doi: 
10.1109/ACCESS.2018.2837621. 
[174] L. Guo, Y. Lei, S. Xing, T. Yan, and N. Li, “Deep Convolutional Transfer 
Learning Network: A New Method for Intelligent Fault Diagnosis of Machines 
with Unlabeled Data,” IEEE Trans. Ind. Electron., vol. 66, no. 9, pp. 7316–7325, 
Sep. 2019, doi: 10.1109/TIE.2018.2877090. 
[175] L. Wen, L. Gao, and X. Li, “A new deep transfer learning based on sparse auto-
encoder for fault diagnosis,” IEEE Trans. Syst. Man, Cybern. Syst., vol. 49, no. 1, 
pp. 136–144, Jan. 2019, doi: 10.1109/TSMC.2017.2754287. 
[176] I. J. Goodfellow et al., “Generative Adversarial Nets,” in Conference on Neural 
Information Processing Systems (NIPS), 2014, pp. 2672–2680, Accessed: May 
31, 2020. [Online]. Available: http://www.github.com/goodfeli/adversarial. 
[177] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with 
deep convolutional generative adversarial networks,” in International Conference 
on Learning Representations (ICLR), 2016. 
[178] K. M. Borgwardt, A. Gretton, M. J. Rasch, H.-P. Kriegel, B. Schölkopf, and A. 
J. Smola, “Integrating structured biological data by Kernel Maximum Mean 
Discrepancy,” Bioinformatics, vol. 22, no. 14, pp. 49–57, 2006, doi: 
10.1093/bioinformatics/btl242. 
[179] S. J. Pan, I. W. Tsang, J. T. Kwok, and Q. Yang, “Domain adaptation via transfer 
component analysis,” IEEE Trans. Neural Networks, vol. 22, no. 2, pp. 199–210, 
Feb. 2011, doi: 10.1109/TNN.2010.2091281. 
[180] W. Lu, B. Liang, Y. Cheng, D. Meng, J. Yang, and T. Zhang, “Deep Model 
Based Domain Adaptation for Fault Diagnosis,” IEEE Trans. Ind. Electron., vol. 
64, no. 3, pp. 2296–2305, Mar. 2017, doi: 10.1109/TIE.2016.2627020. 
[181] L. Van Der Maaten, “Accelerating t-SNE using Tree-Based Algorithms,” J. Mach. 
Learn. Res., vol. 15, pp. 3221–3245, 2014, Accessed: May 31, 2020. [Online]. 
Available: http://homepage.tudelft.nl/19j49/tsne.
[182] L. Van Der Maaten and G. Hinton, “Visualizing Data using t-SNE,” J. Mach. 
Learn. Res., vol. 9, pp. 2579–2605, 2008. 
[183] L. Bottou, O. Chapelle, D. DeCoste, and J. Weston, “Support Vector Machine 
Solvers,” in Large Scale Kernel Machines, 2007, pp. 1–27. 
[184] K. He and J. Sun, “Convolutional neural networks at constrained time cost,” in 
IEEE Computer Society Conference on Computer Vision and Pattern 
Recognition (CVPR), Oct. 2015, pp. 5353–5360, doi: 
10.1109/CVPR.2015.7299173. 
[185] S. Shao, S. McAleer, R. Yan, and P. Baldi, “Highly Accurate Machine Fault 
Diagnosis Using Deep Transfer Learning,” IEEE Trans. Ind. Informatics, vol. 15, 
no. 4, pp. 2446–2455, Apr. 2019, doi: 10.1109/TII.2018.2864759. 
[186] H. Oh, J. H. Jung, B. C. Jeon, and B. D. Youn, “Scalable and Unsupervised 
Feature Engineering Using Vibration-Imaging and Deep Learning for Rotor 
System Diagnosis,” IEEE Trans. Ind. Electron., vol. 65, no. 4, pp. 3539–3549, 
Apr. 2018, doi: 10.1109/TIE.2017.2752151. 
[187] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville, “Improved 
Training of Wasserstein GANs,” in Conference on Neural Information 
Processing Systems (NIPS), 2017, [Online]. Available: 
https://arxiv.org/abs/1704.00028. 
[188] A. Britto Mattos, D. A. Borges Oliveira, and E. da Silva Morais, “Improving 
CNN-Based Viseme Recognition Using Synthetic Data,” in IEEE International 
Conference on Multimedia and Expo, Oct. 2018, doi: 
10.1109/ICME.2018.8486470. 
[189] N. Anantrasirichai, J. Biggs, F. Albino, and D. Bull, “A deep learning approach 
to detecting volcano deformation from satellite imagery using synthetic datasets,” 
Remote Sens. Environ., vol. 230, p. 111179, Sep. 2019, doi: 
10.1016/j.rse.2019.04.032. 
[190] H. Arnout, J. Kehrer, J. Bronner, and T. Runkler, “Visual Evaluation of 
Generative Adversarial Networks for Time Series Data,” in AAAI Fall 
Symposium, 2019, [Online]. Available: https://arxiv.org/abs/2001.00062. 
[191] S. Lu, T. Sirojan, B. T. Phung, D. Zhang, and E. Ambikairajah, “DA-DCGAN: 
An Effective Methodology for DC Series Arc Fault Diagnosis in Photovoltaic 
Systems,” IEEE Access, vol. 7, pp. 45831–45840, 2019, doi: 
10.1109/ACCESS.2019.2909267.
[192] A. Zhang et al., “Transfer Learning with Deep Recurrent Neural Networks for 
Remaining Useful Life Estimation,” Appl. Sci., vol. 8, no. 12, p. 2416, Nov. 2018, 
doi: 10.3390/app8122416. 
 
	Title Page : INTELLIGENT DC SERIES ARC FAULT DETECTION USING DEEP LEARNING IN PHOTOVOLTAIC SYSTEMS
	Acknowledgement
	Abstract
	Table of contents
	List of Acronyms
	List of Figures
	List of Tables
	1. Introduction
	1.1. Background and Research Motivation
	1.2. Summary of Research Contributions
	1.3. Thesis Organisation
	1.4. List of Publications
	2. Literature Review
	2.1. Introduction
	2.2. DC Arc in Photovoltaic Systems
	2.2.1. Photovoltaic Systems Structure and Arc Hazards
	2.2.2. Challenges to Detect DC Arc Faults
	2.3. DC Arc Models
	2.3.1. Physics-based Arc Model
	2.3.2. V-I Characteristic-based Arc Model
	2.3.2.1. Nottingham Arc Model
	2.3.2.2. Hall, Myer, and Viicheck Arc Model
	2.3.2.3. Stokes and Oppenlander Arc Model
	2.3.2.4. Paukert Arc Model
	2.3.2.5. Modified Paukert Arc Model
	2.3.3. Heuristic Arc Model
	2.3.4. High-Frequency Variation Caused by Arc Faults
	2.4. DC Arc Faults Detection Methods in PV Systems
	2.4.1. Sensors for Measurement
	2.4.2. Fast Fourier Transform
	2.4.3. Short Time Fourier Transform
	2.4.4. Wavelet Transform
	2.4.5. Statistical Analysis
	2.4.6. Model-based Methods
	2.4.7. Machine Learning based Methods
	2.4.8. Other Types of Methods
	2.5. Discussion and Conclusion
	3. Characteristics Study on DC Series Arc Fault
	3.1. Introduction
	3.2. Experimental Setup
	3.3. Static Characteristics
	3.3.1. V-I Characteristics
	3.3.2. Stable Operating Point
	3.3.3. Load Current Effect and Source Voltage Effect
	3.4. High-frequency Variation in Arc Current
	3.4.1. Wavelet Packet Entropy
	3.4.2. Effect of Arc Phase
	3.4.3. Load Current Effect
	3.4.4. Source Voltage Effect
	3.4.5. Gap Distance Effect
	3.5. Discussion and Conclusion
	4. DC Series Arc Fault Detection in PV systems using Deep Learning
	4.1. Introduction
	4.2. Classical Machine Learning
	4.2.1. Artificial Neural Network
	4.2.2. Support Vector Machine
	4.2.3. Decision Tree and Random Forest
	4.2.4. k-Nearest Neighbours
	4.2.5. Others
	4.3. Deep Learning
	4.3.1. Deep Fully-Connected Neural Network
	4.3.2. Autoencoder
	4.3.3. Convolutional Neural Network
	4.3.4. Recurrent Neural Network
	4.4. Experimental Setup
	4.5. Proposed Deep Learning based Series Arc Fault Detection Method using Convolutional Neural Network
	4.5.1. Dataset Preparation
	4.5.2. Hyperparameters Setting and Offline Validation Results
	4.5.2.1. Size of Filter
	4.5.2.2. Number of Filters in the Convolution Layer
	4.5.2.3. Number of Convolution Layers
	4.5.2.4. Number of Fully Connected Layers and Number of Neurons in Each Layer
	4.5.2.5. Comparison with Very Deep CNNs
	4.5.3. Evaluation of Different ML Classifiers
	4.5.3.1. Datasets Preparation
	4.5.3.2. Settings for Different ML Classifiers
	4.5.3.3. Results of Comparative Study
	4.5.4. Real-time Implementation and Validation Results
	4.6. Discussion and Recommendations
	4.6.1. Imbalanced Dataset or Small Dataset
	4.6.2. Inconsistency between Training and Testing Dataset
	4.6.3. Unlabelled Dataset
	4.6.4. ML Model Complexity and Real-time Capability
	4.6.5. Model Interpretability
	4.7. Conclusion
	5. Intelligent DC Series Arc Fault Detection in PV Systems using DA-DCGAN without Target-Domain Fault Data
	5.1. Introduction
	5.2. Experimental Setup
	5.2.1. Experimental Setup and Conditions in Source Domain
	5.2.2. Experimental Setup and Conditions in Target Domain
	5.3. Proposed DA-DCGAN
	5.3.1. Generative Adversarial Networks
	5.3.2. Optimisation Procedures and Deep Learning Model Structures
	5.4. Case Study 1: Offline Validation Results
	5.5. Case Study 2: Real-time Implementation and Validation Results
	5.6. Conclusion
	6. Intelligent DC Series Arc Fault Detection in PV Systems using LTCNN-ADA with Limited Target-Domain Fault Data
	6.1. Introduction
	6.2. Experimental Setup
	6.2.1. Experimental Setup and Conditions in Source Domain
	6.2.2. Experimental Setup and Conditions in Target Domain
	6.3. Proposed LTCNN-ADA
	6.3.1. Transfer Learning
	6.3.2. Wasserstein Generative Adversarial Networks
	6.3.3. Procedures of the Proposed LTCNN-ADA
	6.4. Case Study 1: Offline Validation Results
	6.4.1. Analysis and Evaluation of Generated Arcing Data
	6.4.1.1. Training Loss Curves Analysis of ADA by WGAN-GP
	6.4.1.2. High Dimensional Feature Visualisation
	6.4.1.3. Frequency Domain Analysis
	6.4.2. Results with Different Number of Fault Samples
	6.4.3. Comparative Study with Related Works
	6.5. Case Study 2: Online Validation Results
	6.5.1. Three-Phase PV System
	6.5.2. Single-Phase PV System
	6.6. Conclusion
	7. Conclusion and Future Work
	7.1. Conclusion
	7.2. Suggestions for Future Work
	Reference
analysis window is 0.2 seconds)
Figure 4.1 General procedures for ML-based DC arc fault detection methods
Figure 4.2 Structure of a back propagation neural network (MLP)
Figure 4.3 Illustration of the backpropagation algorithm
Figure 4.4 Gradient descent with different learning rate
Figure 4.5 A simple linear SVM for classification
Figure 4.6 A simple illustration of decision tree for series arc fault detection
Figure 4.7 Diagram of a SAE for series arc fault detection
Figure 4.8 General structure for a convolutional neural network
Figure 4.9 Diagram of a LSTM block in a LSTM model
Figure 4.10 Schematic diagram of the experimental setup
Figure 4.11 Original CNN architecture: (a) LeNet 5; (b) AlexNet; (c) VGG 16
Figure 4.12 The optimal lightweight CNN structure and feature visualisation
Figure 4.13 Feature extraction process and visualisation of normal/arcing feature map
Figure 4.14 Flowchart of real-time series arc fault detection
Figure 4.15 Response to small step changes induced by irradiance level changes
Figure 4.16 Response to a relatively large step change
Figure 4.17 A mis-operation is experienced during normal conditions
Figure 4.18 Response to start-up transients and MPPT operation of the inverter
Figure 4.19 Response to series arc fault during inverter start-up and MPPT operation
Figure 4.20 Response to several intermittent series arc faults followed by a sustained arcing
Figure 4.21 Response to several intermittent series arc faults followed by an arcing with increasing gap distance
Figure 4.22 A malfunction experienced during arc conditions
Figure 5.1 Experimental setup and schematic representation for target-domain data collection and real-time validation tests
Figure 5.2 General framework of a generative adversarial network
Figure 5.3 Overview of DA-DCGAN for DC series arc fault diagnosis in PV systems
Figure 5.4 Healthy signal capture by CT from source domain and target domain
Figure 5.5 Visualisation of high-level features in the lightweight CNN before classification layer using t-SNE method: (a) DA-DCGAN without MMD; (b) proposed DA-DCGAN.
Figure 5.6 Impact of kernel numbers in the convolution layer of the lightweight CNN on the performance of target-domain DC series arc fault diagnosis
Figure 5.7 Response to DC disconnect switch closing, inrush current during initialisation of inverter, start-up, and MPPT operation
Figure 5.8 Response to fast moving cloud and a series arc fault at high irradiance level (10A, full loading): (a) 5s per division; (b) 200ms per division (zoomed)
Figure 5.9 Response to a fast shading disturbance
Figure 5.10 Response to a series arc fault at low irradiance level in a cloudy day
Figure 5.11 Response to several intermittent series arc faults followed by a sustained arc fault: (a) 2s per division; (b) 200ms per division (zoomed)
Figure 5.12 Response to a series arc fault generated at middle of the PV string on a sunny day: (a) 1s per division; (b) 100ms per division (zoomed)
Figure 6.1 Experimental setup in: (a) source domain; (b) target domain
Figure 6.2 Illustrations of traditional ML methods and ML methods enhanced by transfer learning
Figure 6.3 Examples of experimental signals (sampled at 200-kHz for analysis purpose) and their frequency spectra under normal and arcing conditions in (a) single-phase PV system; (b) three-phase PV system.
Figure 6.4 Overview of the proposed LTCNN-ADA framework
Figure 6.5 Training loss curves of WGAN-GP with different $N_{t,a}$
Figure 6.6 2D visualisation using t-SNE under different transfer tasks ($N_{t,a} = 60$)
Figure 6.7 (a) examples of three phase normal signal; (b) examples of three phase arcing signal; (c) examples of generated three phase arcing signal; (d) frequency spectra of the normalised time-series signals; those signals are randomly selected from the training dataset D and generated arcing dataset by ADA in the transfer task of A→D ($N_{t,a} = 60$)
Figure 6.8 Diagnosis results of different methods for Dataset C: (a) A→C; (b) B→C
Figure 6.9 Diagnosis results of different methods for Dataset D: (a) A→D; (b) B→D
Figure 6.10 Response to inverter switch on, start-up, MPPT, and series arcing in a three phase PV system (12 panels)
Figure 6.11 Response to several intermittent series arcing events generated at the middle of the PV array followed by a sustained series arcing in a three phase PV system (12 panels)
Figure 6.12 Response to partial shading and series arcing in a three phase PV system (12 panels)
Figure 6.13 Response to series arcing generated at very low irradiance level in a three phase PV system (12 panels)
Figure 6.14 Response to inverter switch on, start-up, MPPT, and series arcing in a three phase PV system (6 panels)
Figure 6.15 Response to several intermittent series arcing events generated at the middle of the PV array followed by a sustained series arcing in a three phase PV system (6 panels)
Figure 6.16 Response to partial shading and series arcing in a three phase PV system (6 panels)
Figure 6.17 Response to series arcing generated at the middle of the PV array at low irradiance level in a three phase PV system (6 panels)
Figure 6.18 Response to inverter switch on, start-up, MPPT, and series arcing in a single phase PV system (4 panels)
Figure 6.19 Response to several intermittent series arcing events followed by a sustained series arcing in a single phase PV system (4 panels)
Figure 6.20 Response to partial shading and series arcing in a single phase PV system (4 panels)
Figure 6.21 Response to series arcing generated at the middle of the PV array at low irradiance level in a single phase PV system (4 panels)

List of Tables 
Table 2.1 Summary of DC arc fault models for simulation
Table 2.2 Summary of FFT based detection methods
Table 2.3 Summary of STFT based detection methods
Table 2.4 Summary of wavelet transform based detection methods
Table 2.5 Summary of statistical analysis based detection methods
Table 2.6 Summary of model based detection methods
Table 2.7 Summary of ML detection methods
Table 2.8 Summary of other detection methods
Table 2.9 Comparison of detection methods for DC arc fault detection
Table 3.1 Experimental conditions for characteristics study of DC series arc fault
Table 3.2 Wavelet-packet entropy level for different arc phases (11A/200V)
Table 3.3 Wavelet-packet entropy level (Non-arc state) for different load current and source voltage
Table 3.4 Wavelet-packet entropy level (Arc state) for different load current and source voltage
Table 3.5 Wavelet-packet entropy level for different gap distance (6.5A/158V)
Table 4.1 Activation functions used in this thesis
Table 4.2 Structure and parameters of the optimal lightweight CNN
Table 4.3 Influence of filter size on CNN performance
Table 4.4 Influence of number of filters on CNN performance
Table 4.5 Influence of number of convolution layers on CNN performance
Table 4.6 Influence of fully connected layer settings on CNN performance
Table 4.7 Performance comparison of different CNNs
Table 4.8 Evaluation of different popular ML methods
Table 5.1 Specification of PV model JINKO (JKM350M-72) at STC
Table 5.2 The architecture of different neural networks in DA-DCGAN
Table 5.3 Testing accuracy comparison for target domain series arc fault detection
Table 6.1 Description of datasets for LTCNN-ADA case study
Table 6.2 Symbols and descriptions
Table 6.3 Performance comparison of different algorithms on different transfer tasks using the same dataset

1. Introduction 
1.1. Background and Research Motivation 
With the development of new technology and increasing concerns about 
environmental pollution, renewable energy sources have entered the world stage and 
are gradually replacing traditional fossil fuels. Solar energy is a prominent example: 
as of 30 September 2019, there were more than 2.2 million photovoltaic (PV) 
installations with a combined capacity of more than 13.9 GW in Australia [1]. Solar 
power development is increasing worldwide, and residential rooftop solar panels and 
grid-connected PV generation will play an important role in supporting the main loads 
and micro-grids. The growing number of PV systems, coupled with higher DC operating 
voltage levels, increases the potential for DC arc faults (in the USA, utility-scale PV 
solar farms typically produce voltages between 600 and 1500 volts, and typical building 
PV systems produce voltages between 120 and 600 volts) [2], [3]. Cables, connectors, 
conductors, and other system components deteriorate through long-term weathering 
and ageing; without adequate scheduled maintenance, the likelihood of DC arcs 
occurring in PV systems rises sharply [4]. 
AC arc faults have been investigated extensively for decades, and many 
detection methods have been proposed. Extensive literature reviews on high impedance 
and AC arc faults can be found in [5]–[8]. Various physical and electrical 
characteristics can be exploited for AC arc fault detection, such as intermittency of 
the arc, asymmetry in the current waveform, current buildup, randomness of the current 
magnitude, high-frequency variation in the current waveform, etc. [6]. AC arcs are 
likely to produce distortions around the zero-crossing points of the current 
waveform, because the current periodically crosses zero and the voltage near the 
zero-crossing points is too small to break down the air gap. AC arcs therefore ignite 
and re-ignite in each power frequency cycle, which introduces many 
harmonics into the signal. This feature is commonly used as one of the key 
characteristics for AC arc fault detection. DC arcs, on the other hand, tend to be more 
persistent and harder to extinguish than AC arcs because a DC current has no 
natural zero crossing. Furthermore, the absence of many 
distinguishable features (e.g. distortions around zero-crossing points) makes DC arcs 
more difficult to detect. Examples of AC and DC arc fault current waveforms, 
including their key characteristics in the time domain, are illustrated in Figure 1.1. 
[Figure 1.1 waveform annotations: (a) distortions around zero-crossing points, asymmetry, intermittence; (b) a sudden current drop, before arcing, after arcing, more variations]
Figure 1.1 Examples of arc current waveforms: (a) AC arc fault; (b) DC arc fault 
Ironically, PV systems receive less scheduled maintenance because PV 
components are considered reliable. Arc faults are therefore often overlooked, and 
they can persist for a long time. The high-temperature plasma generated by a sustained 
arc can produce a significant amount of thermal energy [9], which can lead to severe 
damage to PV systems, such as that shown in Figure 1.2 and Figure 1.3 [10], [11]. 
Figure 1.2 Fire due to DC arc faults in Bakersfield, USA in 2009 [10]: (a) Overview of 
burned PV panels; (b) Burned PV conductors 
DC arc faults are becoming increasingly common in PV systems. 
They have caused many catastrophic fires around the world in the past decade. The 
events in Bakersfield (USA) in 2009 and Mount Holly (USA) in 2011 
drew attention to the problem and triggered the formation and improvement of the 
relevant standards and codes [10], [12]. Many solar fire incidents have recently been 
reported across Australia, for example in Haddon in Victoria, Cairns in Queensland, 
Felixstow in South Australia, and New Lambton in New South Wales [11]. 
Although arc faults can be mitigated by improving the construction and design of 
the PV system and its components, a detection device that continuously 
monitors for arc faults would considerably increase system reliability and reduce the 
fire risk. The 2011 National Electrical Code requires all rooftop PV systems with a DC 
operating voltage above 80 volts to be equipped with series arc fault circuit 
interrupters (AFCIs); in 2014 the requirement was extended to all types of PV systems 
above 80 volts to reduce the fire hazard due to arc faults [13]. The UL-1699B Outline 
was introduced in 2013 to fulfil these requirements [14], and the first edition of the 
formal UL-1699B Standard became available in August 2018 as a guideline for developing 
and testing arc fault detectors (AFDs) and AFCIs [15]. All these factors call for the 
development of reliable, robust, cost-effective, and intelligent DC arc fault detection 
algorithms for PV system protection. 
 
Figure 1.3 Fires due to DC arc faults in Australia in recent years [11] 
Series arc faults are considered more dangerous than parallel arc faults because they 
are more difficult to detect [16]. Unlike a parallel arc fault, a series 
arc fault increases the circuit impedance, so the load current decreases and 
cannot reach the 156% of the maximum normal current, as defined by the National 
Electrical Code, that is needed to melt the overcurrent protection fuse [17], [18]. 
Furthermore, it has been shown in [19] that the overwhelming majority of faults that 
result in fires in PV systems are series arc faults and grounding faults; other types 
of parallel arc faults in PV systems are less likely to occur. Undetected grounding 
faults can contribute to parallel arc faults, but this type of fault is better prevented 
by improving the detection of, and protection against, grounding faults. For this 
reason, the relevant standards and codes focus mainly on the detection of series arc 
faults in PV systems, and series arc faults are the focus of this thesis. 
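The overcurrent argument above can be checked with simple arithmetic. The following is a minimal sketch only: it assumes an ideal voltage source feeding a resistive load, which is a simplification and not a PV source model, and the names `loop_current` and `fuse_would_melt` as well as all numerical values are hypothetical choices for illustration, not thesis measurements.

```python
# Illustrative sketch only: a resistive DC loop, not a PV source model.
OVERCURRENT_FACTOR = 1.56  # 156% of the maximum normal current, as cited in the text


def loop_current(v_source: float, r_load: float, v_arc: float = 0.0) -> float:
    """Current (A) in a series loop; v_arc is the voltage dropped across the arc."""
    return (v_source - v_arc) / r_load


def fuse_would_melt(i_fault: float, i_normal: float) -> bool:
    """True if the fault current reaches the overcurrent threshold."""
    return i_fault >= OVERCURRENT_FACTOR * i_normal


# Hypothetical values chosen for illustration (assumptions, not measured data).
V_SOURCE = 200.0  # source voltage in volts
R_LOAD = 20.0     # load resistance in ohms
V_ARC = 30.0      # assumed arc voltage drop in volts

i_normal = loop_current(V_SOURCE, R_LOAD)             # healthy circuit
i_series_arc = loop_current(V_SOURCE, R_LOAD, V_ARC)  # series arc in the loop

print(f"normal current     : {i_normal:.2f} A")
print(f"series-arc current : {i_series_arc:.2f} A")
print(f"fuse threshold     : {OVERCURRENT_FACTOR * i_normal:.2f} A")
print(f"fuse melts?        : {fuse_would_melt(i_series_arc, i_normal)}")
```

Under these assumptions the series arc lowers the current below its normal value, so it moves away from the fuse threshold rather than towards it, which is why overcurrent protection alone does not respond to a series arc.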
1.2. Summary of Research Contributions 
The contributions by the author from this research are: 
• A comprehensive review of DC arc faults in PV systems and their state-of-the-
art diagnosis methods is carried out. Useful information and technical details 
of applied methods are summarised. The capabilities and limitations of 
different detection methods are presented and discussed. 
• The characteristics of DC series arc fault are investigated. A time-frequency 
domain method based on wavelet packet decomposition (WPD) and entropy 
theory is developed to extract consistent patterns characterising the arc current. 
• A novel deep learning (DL) DC series arc fault detection method for PV 
systems using convolutional neural network (CNN) is developed. To the best 
of the author’s knowledge, this is the first time that DL has been applied, and 
its feasibility investigated, in this specific application. Different machine 
learning (ML) methods, including conventional ML and DL, are evaluated and 
compared using the same experimental datasets collected in the laboratory to 
examine their effectiveness. Technical roadblocks preventing intelligent DC series 
arc fault detection from being applied in industry are identified, 
including imbalanced datasets, small datasets, data inconsistency, unlabelled 
datasets, model complexity and real-time capability, and model interpretability. 
Detailed recommendations and potential solutions to these challenges are 
provided. 
• An optimal lightweight CNN structure is developed and comprehensively 
tested. It achieves the same level of detection accuracy as well-known deep 
CNN models with substantially lower model complexity. The proposed method 
is therefore better suited to cost-effective real-time deployment on 
resource-constrained edge devices. 
• A novel and effective methodology, namely domain adaptation and deep 
convolutional generative adversarial network (DA-DCGAN), is proposed. It 
overcomes the performance degradation of DL-based detection algorithms 
caused by data inconsistency between the source-domain data (e.g. laboratory) 
used during the development and the target-domain data (e.g. real PV systems) 
encountered in operation in the field. It also tackles the lack of fault data 
in the target domain, even in the extreme case where no target-domain 
fault data are available. 
• A novel framework, namely lightweight transfer convolutional neural network 
with adversarial data augmentation (LTCNN-ADA), is developed to address 
the same problems, i.e. data inconsistency and insufficient data. It optimises the 
detection performance of DL-based algorithms where only a limited amount of 
target-domain fault data is available. 
1.3. Thesis Organisation 
The organisation of this thesis is as follows: 
Chapter 1 - Introduction. This chapter provides the background context, research 
motivation, novel contributions of this research, and achievements by the author. 
Chapter 2 - Literature Review. This chapter presents an in-depth review of DC arc 
faults in PV systems and their diagnosis methods. 
Chapter 3 – Characteristics Study on DC Series Arc Fault. This chapter examines 
the characteristics of DC series arc fault in detail. The impact of the load current, source 
voltage, and arc gap distance on the static and high-frequency 
characteristics is investigated. Wavelet packet entropy is applied to analyse the arc 
current signals. 
Chapter 4 – DC Series Arc Fault Detection in PV Systems using Deep Learning. 
This chapter presents a DL-based detection algorithm using CNN. A lightweight CNN 
structure is designed to achieve a good balance between model complexity and 
detection accuracy. A comparative study among different conventional ML methods 
and DL methods is performed using the same experimental dataset. Finally, challenges 
in applying ML methods for practical PV series arc fault detection are identified and 
potential solutions are provided. 
Chapter 5 - Intelligent DC Series Arc Fault Detection in PV Systems using DA-
DCGAN without Target-Domain Fault Data. This chapter presents a novel 
methodology based on DL for DC series arc fault detection in PV systems when the 
fault data from the target PV system is not available. The proposed method is 
implemented in an embedded system and validated in real-time under different test 
conditions. 
Chapter 6 - Intelligent DC Series Arc Fault Detection in PV Systems using 
LTCNN-ADA with Limited Target-Domain Fault Data. This chapter presents a 
novel framework based on DL for DC series arc fault detection in PV systems when the 
fault data from the target PV system is available but limited. Its effectiveness is 
validated through comprehensive off-line analysis, comparative studies with other 
recent work, and real-time experiments under different test conditions. 
Chapter 7 - Conclusion and Future Work. This chapter summarises the major 
outcomes of this research and provides suggestions for further work. 
1.4. List of Publications 
Journal Papers 
1. Shibo Lu, Rui Ma, Tharmakulasingam Sirojan, B.T. Phung, and Daming 
Zhang, “Lightweight Transfer Nets and Adversarial Data Augmentation for 
Photovoltaic Series Arc Fault Detection with Limited Fault Data”, Submitted 
to Solar Energy, 2020. 
2. Shibo Lu, Hua Chai, Animesh Sahoo, and B. T. Phung, “Condition 
Monitoring based on Partial Discharge Diagnostics using Machine Learning 
Methods: A Comprehensive State-of-the-Art Review,” accepted for 
publication in IEEE Transactions on Dielectrics and Electrical Insulation, 3 
July, 2020. 
3. Shibo Lu, Tharmakulasingam Sirojan, B. T. Phung, Daming Zhang, and 
Eliathamby Ambikairajah, “DA-DCGAN: An Effective Methodology for DC 
Series Arc Fault Diagnosis in Photovoltaic Systems,” IEEE Access, vol. 7, pp. 
45831-45840, April 2019. 
4. Tharmakulasingam Sirojan, Shibo Lu, B. T. Phung, Daming Zhang, and 
Eliathamby Ambikairajah, “Sustainable Deep Learning at Grid Edge for Real-
time High Impedance Fault Detection,” IEEE Transactions on Sustainable 
Computing, doi: 10.1109/TSUSC.2018.2879960 
5. Shibo Lu, B. T. Phung and Daming Zhang, “A Comprehensive Review on DC 
Arc Faults and Their Diagnosis Methods in Photovoltaic Systems,” Renewable 
& Sustainable Energy Reviews, vol. 89, pp.88-98, June 2018. 
Conference Papers 
1. Shibo Lu, Animesh Sahoo, Rui Ma, and B. T. Phung, “DC Series Arc Fault 
Detection using Machine Learning in Photovoltaic Systems: Recent 
Developments and Challenges,” International Conference on Condition 
Monitoring (CMD), Phuket, Thailand, 25-28 Oct. 2020. 
2. Tharmakulasingam Sirojan, Shibo Lu, B. T. Phung, Daming Zhang, and 
Eliathamby Ambikairajah, “Embedded Edge Computing for Smart Meter Data 
Analytics,” International Conference on Smart Energy Systems and 
Technologies (SEST), Porto, Portugal, 9-11 Sep. 2019. 
3. Shibo Lu, B. T. Phung, Daming Zhang, and Hua Chai, “An Experimental Study 
of Low-Current DC Series Arc Faults for Condition Monitoring Purpose,” 
International Conference and Exhibition on Electricity and Distribution 
(CIRED), Madrid, Spain, 3-6 June 2019. 
4. Hua Chai, Shibo Lu, B. T. Phung, and Daming Zhang, “Comparative Study of 
Partial Discharge Localization based on UHF Detection Methods,” International 
Conference and Exhibition on Electricity and Distribution (CIRED), Madrid, 
Spain, 3-6 June 2019. 
5. Miao Li, Shibo Lu, Daming Zhang, and B. T. Phung, “Series Arc Fault 
Detection in DC Microgrid Using Hybrid Detection Method,” Annual 
Conference of the IEEE Industrial Electronics Society (IECON), Washington 
D.C., USA, 21-23 Oct. 2018. 
6. Tharmakulasingam Sirojan, Shibo Lu, B. T. Phung, Daming Zhang and 
Eliathamby Ambikairajah, “High Impedance Fault Detection by Convolutional 
Deep Neural Network,” IEEE International Conference on High Voltage 
Engineering and Application (ICHVE), Athens, Greece, 10-13 Sept. 2018. 
7. Shibo Lu, B. T. Phung, and Daming Zhang, “Study on DC Series Arc Fault in 
Photovoltaic systems for Condition Monitoring Purpose,” Australasian 
Universities Power Engineering Conference (AUPEC), Melbourne, Australia, 
Nov. 2017. 
8. Shibo Lu, Daming Zhang and B. T. Phung, “Arcing Fault Detection in the 
Scenario with Renewable Energy Generation,” Annual Conference of the IEEE 
Industrial Electronics Society (IECON), Beijing, China, Oct/Nov. 2017. 
9. Ruihao Song, Shibo Lu, Tharmakulasingam Sirojan, and B.T. Phung, “Power 
Quality Monitoring of Single-Wire-Earth-Returned Distribution Feeders,” 
International Conference on High Voltage Engineering and Power Systems 
(ICHVEPS), Bali, Indonesia, Oct. 2017. 
Patent 
1. Tharmakulasingam Sirojan, Shibo Lu, B. T. Phung, and Eliathamby 
Ambikairajah, “Apparatus and process for real-time detection of high-
impedance faults in power lines,” No. PCT/AU2019/051219, Nov. 2019. 
2. Literature Review 
2.1. Introduction 
Two existing surveys address DC arc faults in PV systems [20], [21]. Yao et al. 
briefly reviewed a limited number of arc fault detection techniques for DC systems, 
including PV systems [20]. Alam et al. conducted a comprehensive survey on detection 
and mitigation techniques for catastrophic faults in PV systems, such as line-line 
faults, ground faults, and arc faults [21]. However, neither study presented arc fault 
diagnosis techniques for PV systems in detail. Moreover, neither survey discussed 
the capabilities and limitations of different detection algorithms, such as the required 
sampling frequency and computation load, which are important considerations when 
the algorithms are implemented in microprocessor controllers. 
The primary objective of this chapter is to present the state-of-the-art methods 
for diagnosing DC arc faults in PV systems. The capabilities and limitations of the 
different methods are discussed, compared, and summarised. In addition, developing 
effective detection algorithms requires an understanding of arc fault characteristics 
and mechanisms. Since field testing is difficult, costly, time-consuming, and still 
not exhaustive (it cannot cover all fault types), precise modelling of arc faults 
becomes all the more important. Therefore, the different types of DC arc fault model 
and their suitability for PV system applications are presented and compared. Finally, 
the development trend of detection methods in PV systems is discussed. 
2.2. DC Arc in Photovoltaic Systems 
2.2.1. Photovoltaic Systems Structure and Arc Hazards 
 
The basic structure of a PV system and the procedure for forming a solar array are 
shown in Figure 2.1. Solar cells are connected together to form a solar module, and 
solar modules are usually connected in series to form a solar string, which raises the 
DC operating voltage of the whole system. These solar strings can then be connected in 
parallel to increase the DC current level, which accordingly increases the power 
generation capability [22], [23]. 
 
Figure 2.1 Typical structure of PV systems 
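
The series and parallel scaling described above can be sketched in a few lines. This is a minimal illustration that assumes identical modules operating at the same point; the `Module` class, the function name, and the example values (30 V, 8 A, 20 modules per string, 4 strings) are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Operating point of one PV module (hypothetical example values)."""
    voltage_v: float  # module operating voltage in volts
    current_a: float  # module operating current in amps


def array_operating_point(module, modules_per_string, strings_in_parallel):
    """Return (voltage, current, power) for an array of identical modules."""
    if modules_per_string < 1 or strings_in_parallel < 1:
        raise ValueError("module and string counts must be at least 1")
    string_voltage = module.voltage_v * modules_per_string  # series: voltages add
    array_current = module.current_a * strings_in_parallel  # parallel: currents add
    return string_voltage, array_current, string_voltage * array_current


v, i, p = array_operating_point(Module(30.0, 8.0), modules_per_string=20,
                                strings_in_parallel=4)
print(f"{v:.0f} V, {i:.0f} A, {p / 1000:.1f} kW")  # 600 V, 32 A, 19.2 kW
```

Real modules are never perfectly matched, so the figures from such a calculation are an upper estimate of the array output.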
There are two main types of arc fault in PV systems: series and parallel arc faults, 
the latter including grounding arc faults. Parallel and grounding arc faults often 
draw a large fault current because of the large potential difference across the fault, 
which makes them easier to detect with traditional protection devices [17], [18]. In 
contrast, a series arc fault current is inherently lower than the normal operating 
current, so it is not sufficient to melt a fuse or trip overcurrent protection devices. 
Apart from grounding arc faults, parallel arc faults are also less likely to occur in 
PV systems than series arc faults [19]. Therefore, the relevant standards and codes for 
PV systems focus mainly on series arc fault detection and protection. 
Series arcs can form across small gaps between two connecting terminals, such as the 
busbar-ribbon connection in a PV module or a connection in a combiner box, as shown 
in Figure 2.2. A lack of scheduled maintenance, ageing, weathering (e.g. corrosion 
caused by rain), mechanical damage induced by wind, animal bites, and improper 
wiring can all produce bad joints. A bad joint reduces the effective cross-sectional 
area of the connection, which increases the connection resistance and significantly 
increases the heat loss. The higher operating temperature introduces more thermal 
stress and accelerates the deterioration of the connection, which leads to a loose 
connection [24]. A small gap may then develop between the two connecting terminals 
without interrupting the current flow. When the electric field across the gap exceeds 
approximately 3 V/μm (the breakdown strength depends on the surrounding environment), 
the air in the gap starts to ionise and an arc plasma develops, which finally forms a 
series arc. Oxygen flowing into the plasma stream further sustains the arc discharge. 
The gap distance for series arcs is typically less than a few millimetres. The fault 
current magnitude is low, typically a few amps, at the solar cell, module, or string 
level. However, it can reach several hundred amps at the combiner box and more than 
a thousand amps at the DC side of the inverter in large PV systems. 
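
As a rough illustration of the 3 V/μm threshold, the sketch below estimates the average field across a gap as voltage divided by width and compares it with that figure. The uniform-field approximation, the function names, and the example voltage and gap widths are assumptions made for illustration only; real breakdown depends on electrode geometry and on the surrounding environment, as noted above.

```python
AIR_BREAKDOWN_FIELD_V_PER_UM = 3.0  # approximate threshold quoted in the text


def gap_field_v_per_um(gap_voltage_v, gap_width_um):
    """Average field across the gap, assuming a uniform field (V / d)."""
    if gap_width_um <= 0:
        raise ValueError("gap width must be positive")
    return gap_voltage_v / gap_width_um


def may_ionise(gap_voltage_v, gap_width_um,
               threshold_v_per_um=AIR_BREAKDOWN_FIELD_V_PER_UM):
    """True if the average field exceeds the assumed breakdown threshold."""
    return gap_field_v_per_um(gap_voltage_v, gap_width_um) > threshold_v_per_um


# Hypothetical case: 600 V appearing across a loosening connector.
for width_um in (100.0, 1000.0):
    field = gap_field_v_per_um(600.0, width_um)
    print(f"{width_um:.0f} um gap: {field:.1f} V/um, "
          f"ionisation possible: {may_ionise(600.0, width_um)}")
```

Under these assumptions the narrower gap sees a much higher field for the same voltage, which is consistent with arcs starting while the gap is still very small.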
Parallel arcs have a mechanism similar to that of series arcs. They can develop 
between two conductors in the same string, between conductors of two different strings, 
or between a conductor and a grounding point, as shown in Figure 2.2 [25]. Parallel arc 
faults are mainly caused by the degradation and breakdown of insulation due to factors 
such as animal bites, mechanical damage, and ageing, because most cables and wires in 
PV systems are exposed to the open environment without a protective enclosure [26]. 
Both the fault current level and the gap width are larger than those of a series arc 
fault. 
 
Figure 2.2 Example of possible locations where arcing may occur in PV systems 
Owing to the nature of PV modules and the design of PV systems, there can be 
thousands of connectors and a large amount of cabling in a system, as shown in 
Figure 2.1 and Figure 2.2. Every connecting point can give rise to an arc fault, which 
substantially increases the probability of arc fault occurrence, especially of series 
arc faults. Because the inverter performs maximum power point tracking (MPPT), the 
fault current can return to its normal level while the arc fault persists, which makes 
DC arc fault detection more challenging. It has been shown that an arc area of only 
0.4 mm² can ignite surrounding materials and burn off the metal coating within 
2 seconds [27]. In [28], it was found that, at a radius of 10 mm, the surface ignition 
times for plastics are 4 s, 1 s, and 0.3 s at arc powers of 200 W, 400 W, and 800 W, 
respectively. As a result, the heat generated by an arc that remains undetected for a 
long time can seriously damage system components and poses a severe threat to 
system stability and human safety [29]. The UL-1699B Outline requires every arc fault 
circuit interrupter (AFCI) or arc fault detector (AFD) to pass an arc test at a power 
of 300-900 W [14]. However, low-power arcs are common at the string level and can 
also lead to a fire. Therefore, it has been recommended that a 100 W arc fault test be 
added to UL-1699B [30]. 
The formal UL-1699B Standard became available in August 2018 [15]. The total 
response time $T_{response}$ of an AFCI or AFD, which is limited to 2.5 seconds, is 
defined in (2.1): 
$$T_{response} = \min\left(\frac{750\ \mathrm{J}}{I_{arc} V_{arc}},\ 2.5\ \mathrm{s}\right) \qquad (2.1)$$
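
To make the response-time requirement concrete, the sketch below evaluates it for a few arc powers. It reads the 2.5 s figure as an upper bound, so the allowed time is the smaller of the energy-based time (750 J divided by the arc power) and 2.5 s; this reading, the function name, and the example currents and voltages are assumptions for illustration and should be checked against the standard itself.

```python
ENERGY_LIMIT_J = 750.0  # arc energy figure appearing in (2.1)
TIME_LIMIT_S = 2.5      # time limit stated in the text


def allowed_response_time_s(arc_current_a, arc_voltage_v):
    """Allowed detection-and-interruption time, assuming 2.5 s is a cap."""
    arc_power_w = arc_current_a * arc_voltage_v
    if arc_power_w <= 0:
        raise ValueError("arc power must be positive")
    return min(ENERGY_LIMIT_J / arc_power_w, TIME_LIMIT_S)


# Hypothetical arcs of 100 W, 300 W, and 900 W.
for current_a, voltage_v in ((5.0, 20.0), (10.0, 30.0), (30.0, 30.0)):
    t = allowed_response_time_s(current_a, voltage_v)
    print(f"{current_a * voltage_v:.0f} W arc: respond within {t:.2f} s")
```

Under this reading, the energy term only becomes the binding constraint for arcs above 300 W; at or below that power the 2.5 s cap applies.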
Besides the fire hazard, arc faults have a severe impact on the operating point of 
the PV system [31]. A series arc fault inserts extra impedance into the PV system, 
which can cause mismatch loss, heating loss, and a decreased fill factor, and thus 
reduces the efficiency of the PV system. Although the solar inverter keeps tracking 
the maximum power point, the overall output of the system still falls. Therefore, it is 
important to protect PV systems from such a situation. 
2.2.2. Challenges to Detect DC Arc Faults 
The majority of AC arc detection algorithms rely on the most distinctive feature of 
an AC arc: the “shoulder” in the arc current waveform [5]. However, DC arcs have no 
zero-crossing points, and the absence of a “shoulder” makes a DC arc more sustainable 
and more difficult to detect [32]. To develop an effective detection algorithm that 
differentiates a DC arc fault from normal conditions, it is necessary not only to study 
the DC arc characteristics and extract the arc features, but also to identify the 
disturbances that arise in normal operation. These disturbances can cause an AFCI or 
AFD to fail to respond, or to interrupt the electricity delivery wrongly. As shown in 
Figure 2.3, both internal and external disturbances can mask or modify the arc 
features: 
• The intensity and variation of arc noise are affected by several factors, such as 
the electrode material and geometry, the current and voltage levels, the load type, 
and the source type. For example, for the same non-arcing loop current, the arc 
discharge tends to be more stable at a higher voltage level, which results in a 
smaller standard deviation of the fault current (less current variation) [33]. As 
another example, the frequency components of the fault current with rounded-tip 
electrodes are clearly lower than those with flat-tip electrodes, because the arc 
forms symmetrically through the radial centre of the electrodes [34]. 
• Long PV cables act as antennas that pick up radio-frequency noise, especially in 
the frequency band from 100 kHz to 500 MHz. Different system topologies and sizes 
also affect the radio-frequency response and may therefore modify the arc noise 
profile [35]. PV cables also have an inductive component, which acts as a low-pass 
filter and attenuates the noise generated by an arc fault [36]. 
• Power electronic loads, such as DC/DC converters and DC/AC inverters, produce 
high-frequency electromagnetic interference noise in the circuit, which can cause 
nuisance tripping [37]. 
• Crosstalk may introduce radio-frequency noise, which can cause tripping [38]. 
• Transformerless inverters are used in some PV systems, in which case AC noise 
(the 50 Hz grid fundamental frequency component and its harmonics) is injected into 
the circuit from the AC side. 
• Step changes caused by load shifting (turning off the converter or inverter), 
fast-moving clouds, partial shading, mechanical vibration induced by wind, system 
shutdown, and power adjustment by the inverter exhibit behaviour similar to that of 
arc faults [31], [39], [40]. 
 
Figure 2.3 Typical disturbance sources 
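
The low-pass behaviour of cable inductance noted in the list above can be sketched with a first-order series RL model, in which the cable is a lumped inductance in series with a resistive load. This lumped model, the function names, and the example values (100 μH, 10 Ω) are assumptions for illustration; a real PV cable is a distributed line and will behave differently at radio frequencies.

```python
import math


def cutoff_frequency_hz(resistance_ohm, inductance_h):
    """-3 dB cutoff frequency of a series RL circuit: f_c = R / (2*pi*L)."""
    if resistance_ohm <= 0 or inductance_h <= 0:
        raise ValueError("resistance and inductance must be positive")
    return resistance_ohm / (2.0 * math.pi * inductance_h)


def current_gain(frequency_hz, resistance_ohm, inductance_h):
    """Gain of the loop current relative to DC, for a noise voltage source."""
    fc = cutoff_frequency_hz(resistance_ohm, inductance_h)
    return 1.0 / math.sqrt(1.0 + (frequency_hz / fc) ** 2)


# Hypothetical cable inductance of 100 uH feeding a 10 ohm load.
R_OHM, L_H = 10.0, 100e-6
print(f"cutoff: {cutoff_frequency_hz(R_OHM, L_H) / 1e3:.1f} kHz")
for f_hz in (1e3, 1e4, 1e5, 1e6):
    print(f"{f_hz / 1e3:7.0f} kHz: gain {current_gain(f_hz, R_OHM, L_H):.3f}")
```

In this simplified picture, arc noise well above the cutoff frequency reaches a current sensor strongly attenuated, which is one reason the detection band and the sensor location matter.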
2.3. DC Arc Models 
An arc is a highly complicated and chaotic physical phenomenon. Its behaviour 
varies with the voltage and current levels, the environmental conditions (i.e. 
ambient temperature, pressure, and moisture level), and the arc length. These factors 
make its physical constants difficult to define. 
Nowadays, the majority of arc studies are based on experimental observation and 
analysis of the acquired data sets, and scholars mainly use the V-I curve to 
characterise this complex phenomenon. The quasi-static arcing V-I characteristic for 
different arc lengths is shown in Figure 2.4. In the lower current region, the smaller 
the current, the larger the voltage, so that the arc power ($P_{arc} = V_{arc} I_{arc}$) 
tends to remain the same; in the higher current region, due to the magnetic effect, the 
voltage remains approximately unchanged as the current increases [41], [42]. The 
transition point line demarcates these two regions. Note that the arc length is not 
equivalent to the gap width. In real situations, the arc length can be significantly 
larger than the gap width unless the arc current is low and the gap width is short. A 
more comprehensive study of DC arc faults can be found in [43] and [44]. 
 
Figure 2.4 V-I characteristic of arc 
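
The two regions of the quasi-static V-I characteristic can be approximated by a piecewise curve: constant power below a transition current and constant voltage above it. The sketch below is an idealisation of that description rather than any of the arc models reviewed in this chapter, and the arc power (60 W) and transition current (3 A) are hypothetical values chosen only to make the shape visible.

```python
def arc_voltage_v(current_a, arc_power_w, transition_current_a):
    """Idealised quasi-static arc voltage for a fixed arc length.

    Below the transition current the arc power is held constant (V = P / I);
    above it the voltage is held at its value at the transition point.
    """
    if current_a <= 0 or arc_power_w <= 0 or transition_current_a <= 0:
        raise ValueError("current, power, and transition current must be positive")
    if current_a < transition_current_a:
        return arc_power_w / current_a
    return arc_power_w / transition_current_a


# Hypothetical curve: 60 W in the low-current region, transition at 3 A.
for i_a in (1.0, 2.0, 3.0, 6.0, 10.0):
    print(f"I = {i_a:4.1f} A -> V = {arc_voltage_v(i_a, 60.0, 3.0):5.1f} V")
```

A longer arc would correspond to a different curve of the same family, which in this sketch means choosing a different power and transition current.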
Acquiring arc fault data from real PV systems is costly and difficult, because the 
features of an arc fault vary with the fault location and current level. An 
alternative approach is therefore to use computer simulation of a realistic arc fault 
model to develop and verify arc fault detection algorithms, which reduces the cost. 
There are three main types of arc model that can be used for simulation: models 
based on physical principles, traditional V-I empirical models obtained from 
measurement data, and heuristic models. A summary of arc models that can be used for 
arc fault simulation in PV systems is presented in Table 2.1. 
 
2.3.1. Physics-based
