
oe1 (光电查) - Scientific Papers

10 results
  • [IEEE 2019 International Conference on Robotics and Automation in Industry (ICRAI) - Rawalpindi, Pakistan (2019.10.21-2019.10.22)] 2019 International Conference on Robotics and Automation in Industry (ICRAI) - Low Cost 2D Laser Scanner Based Indoor Mapping and Classification System

    Abstract: Although the field of automatic speaker or speech recognition has been extensively studied over the past decades, the lack of robustness has remained a major challenge. Feature warping is a promising approach, and its effectiveness depends significantly on the relative positions of the features in a sliding window. However, the relative positions are changed by the non-linear effect of noise. To address this problem, this paper proposes a method based on the ranking feature, which is obtained directly by sorting a feature sequence in descending order. It first labels the central frame in a sliding window as speech dominant or noise dominant ("reliable" or "unreliable"). In the unreliable case, the ranking of the central frame is estimated. Subsequently, the estimated ranking is mapped to a warped feature using a desired target distribution for recognition experiments. Theoretical analysis and experimental results show that the autocorrelation of a ranking sequence is larger than that of the corresponding feature sequence. Moreover, rank correlation is not easily influenced by abnormal data or by highly variable data. Thus, this paper deals with a ranking sequence rather than a feature sequence. The proposed feature enhancement approach is evaluated in an open-set speaker recognition system. The experimental results show that it outperforms the missing data method based on linear interpolation and feature warping in terms of recognition performance in all noise conditions. Furthermore, the proposed method is feature-based, so it may be combined with other technologies, such as model-based and score-based methods, to enhance the robustness of a speaker or speech recognition system.

    Keywords: Robustness, ranking feature, rank correlation, open-set speaker recognition, autocorrelation, feature warping, missing data method

    Updated 2025-09-23 15:21:01

  • [IEEE 2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL) - Sozopol, Bulgaria (2019.9.6-2019.9.8)] 2019 IEEE 8th International Conference on Advanced Optoelectronics and Lasers (CAOL) - The Fast Modification of Evolutionary Bioinspired Cat Swarm Optimization Method

    Abstract: In this paper, we propose a novel probabilistic method for the task of text-independent speaker identification (SI). In order to capture the dynamic information during SI, we design super-mel-frequency cepstral coefficient (super-MFCC) features by cascading three neighboring MFCC frames together. These super-MFCC vectors are used for probabilistic model training so that the speaker's characteristics can be sufficiently captured. The probability density function (PDF) of the super-MFCC features is estimated by the recently proposed histogram transform (HT) method. To mitigate the discontinuity problem that commonly occurs when computing multivariate histograms, more training data are generated by the HT method. Using these generated data, a smooth PDF of the super-MFCC vectors is obtained. Compared with typical PDF estimation methods, such as the Gaussian mixture model, promising improvements have been obtained by employing the HT-based model in SI.

    Keywords: Speaker identification, Gaussian mixture model, mel-frequency cepstral coefficients, histogram transform model

    Updated 2025-09-23 15:21:01

  • [IEEE 2019 8th International Conference on Renewable Energy Research and Applications (ICRERA) - Brasov, Romania (2019.11.3-2019.11.6)] 2019 8th International Conference on Renewable Energy Research and Applications (ICRERA) - A Study of Introduction of the Photovoltaic Generation System to Conventional Railway

    Abstract: This paper presents a voice conversion (VC) method that utilizes the recently proposed probabilistic models called recurrent temporal restricted Boltzmann machines (RTRBMs). One RTRBM is used for each speaker, with the goal of capturing high-order temporal dependencies in an acoustic sequence. Our algorithm starts from the separate training of one RTRBM for a source speaker and another for a target speaker using speaker-dependent training data. Because each RTRBM attempts to discover abstractions to maximally express the training data at each time step, as well as the temporal dependencies in the training data, we expect that the models represent the linguistic-related latent features in high-order spaces. In our approach, we convert (match) features of emphasis for the source speaker to those of the target speaker using a neural network (NN), so that the entire network (consisting of the two RTRBMs and the NN) acts as a deep recurrent NN and can be fine-tuned. Using VC experiments, we confirm the high performance of our method, especially in terms of objective criteria, relative to conventional VC methods such as approaches based on Gaussian mixture models and on NNs.

    Keywords: recurrent temporal restricted Boltzmann machine (RTRBM), voice conversion, speaker specific features, recurrent neural network, Deep Learning

    Updated 2025-09-23 15:19:57

  • [IEEE 2019 IEEE 46th Photovoltaic Specialists Conference (PVSC) - Chicago, IL, USA (2019.6.16-2019.6.21)] 2019 IEEE 46th Photovoltaic Specialists Conference (PVSC) - Developing a web based PV simulation platform (targeting at machine learning combined with advanced device and process simulation to support process optimization)

    Abstract: In this paper, we present a voice conversion (VC) method that does not use any parallel data while training the model. VC is a technique where only speaker-specific information in source speech is converted while keeping the phonological information unchanged. Most of the existing VC methods rely on parallel data, that is, pairs of speech data from the source and target speakers uttering the same sentences. However, the use of parallel data in training causes several problems: 1) the data used for the training are limited to the pre-defined sentences, 2) the trained model is only applied to the speaker pair used in the training, and 3) mismatches in alignment may occur. Although it is thus preferable in VC not to use parallel data, a nonparallel approach is considered difficult to learn. In our approach, we achieve nonparallel training based on a speaker adaptation technique and capturing latent phonological information. This approach assumes that speech signals are produced from a restricted Boltzmann machine-based probabilistic model, where phonological information and speaker-related information are defined explicitly. Speaker-independent and speaker-dependent parameters are simultaneously trained under speaker adaptive training. In the conversion stage, a given speech signal is decomposed into phonological and speaker-related information, the speaker-related information is replaced with that of the desired speaker, and then voice-converted speech is obtained by mixing the two. Our experimental results showed that our approach outperformed another nonparallel approach, and produced results similar to those of the popular conventional Gaussian mixture model-based method that used parallel data, in both subjective and objective criteria.

    Keywords: unsupervised training, speaker adaptation, Restricted Boltzmann machine, voice conversion

    Updated 2025-09-23 15:19:57

  • [IEEE 2019 Photonics & Electromagnetics Research Symposium - Fall (PIERS - Fall) - Xiamen, China (2019.12.17-2019.12.20)] 2019 Photonics & Electromagnetics Research Symposium - Fall (PIERS - Fall) - Parallel Computation Technology for Distributed Optical Fiber Sensing System

    Abstract: The gradual changes that occur in the human voice due to aging create challenges for speaker verification. This study presents an approach to calibrating the output scores of a speaker verification system using the time interval between comparison samples as additional information. Several functions are proposed for the incorporation of this time information, which is viewed as aging information, in a conventional linear score calibration transformation. Experiments are presented on data with short-term aging intervals ranging between 2 months and 3 years, and long-term aging intervals of up to 30 years. The aging calibration proposal is shown to offset the decreased discrimination and calibration performance for both short- and long-term intervals, and to extrapolate well to unseen aging intervals. Relative reductions in C_llr (cost of log-likelihood ratio) of 1–4% and 10–43% are obtained at short- and long-term intervals, respectively. Assuming that a system has knowledge of the time interval between samples under comparison, this approach represents a straightforward means of compensating for the detrimental impact of aging on speaker verification performance.

    Keywords: speaker variability, quality measures, Aging, calibration, speaker verification

    Updated 2025-09-19 17:13:59

  • [IEEE 2019 Research, Invention, and Innovation Congress (RI2C) - Bangkok, Thailand (2019.12.11-2019.12.13)] 2019 Research, Invention, and Innovation Congress (RI2C) - A Single-Stage High-Power-Factor LED Driver based on Interleaved ZCDS Class-E Rectifier

    Abstract: This paper presents a voice conversion (VC) method that utilizes the recently proposed probabilistic models called recurrent temporal restricted Boltzmann machines (RTRBMs). One RTRBM is used for each speaker, with the goal of capturing high-order temporal dependencies in an acoustic sequence. Our algorithm starts from the separate training of one RTRBM for a source speaker and another for a target speaker using speaker-dependent training data. Because each RTRBM attempts to discover abstractions to maximally express the training data at each time step, as well as the temporal dependencies in the training data, we expect that the models represent the linguistic-related latent features in high-order spaces. In our approach, we convert (match) features of emphasis for the source speaker to those of the target speaker using a neural network (NN), so that the entire network (consisting of the two RTRBMs and the NN) acts as a deep recurrent NN and can be fine-tuned. Using VC experiments, we confirm the high performance of our method, especially in terms of objective criteria, relative to conventional VC methods such as approaches based on Gaussian mixture models and on NNs.

    Keywords: recurrent temporal restricted Boltzmann machine (RTRBM), speaker specific features, voice conversion, Deep Learning, recurrent neural network

    Updated 2025-09-19 17:13:59

  • [IEEE 2019 Photonics & Electromagnetics Research Symposium - Fall (PIERS - Fall) - Xiamen, China (2019.12.17-2019.12.20)] 2019 Photonics & Electromagnetics Research Symposium - Fall (PIERS - Fall) - Improved Equivalent Circuit Models of THZ Quantum Cascade Lasers for SPICE Simulation

    Abstract: This paper presents a voice conversion (VC) method that utilizes the recently proposed probabilistic models called recurrent temporal restricted Boltzmann machines (RTRBMs). One RTRBM is used for each speaker, with the goal of capturing high-order temporal dependencies in an acoustic sequence. Our algorithm starts from the separate training of one RTRBM for a source speaker and another for a target speaker using speaker-dependent training data. Because each RTRBM attempts to discover abstractions to maximally express the training data at each time step, as well as the temporal dependencies in the training data, we expect that the models represent the linguistic-related latent features in high-order spaces. In our approach, we convert (match) features of emphasis for the source speaker to those of the target speaker using a neural network (NN), so that the entire network (consisting of the two RTRBMs and the NN) acts as a deep recurrent NN and can be fine-tuned. Using VC experiments, we confirm the high performance of our method, especially in terms of objective criteria, relative to conventional VC methods such as approaches based on Gaussian mixture models and on NNs.

    Keywords: recurrent temporal restricted Boltzmann machine (RTRBM), recurrent neural network, voice conversion, speaker specific features, Deep Learning

    Updated 2025-09-19 17:13:59

  • [IEEE 2019 Compound Semiconductor Week (CSW) - Nara, Japan (2019.5.19-2019.5.23)] 2019 Compound Semiconductor Week (CSW) - Relative intensity noise of silicon-based quantum dot lasers

    Abstract: In this paper, we present a voice conversion (VC) method that does not use any parallel data while training the model. VC is a technique where only speaker-specific information in source speech is converted while keeping the phonological information unchanged. Most of the existing VC methods rely on parallel data, that is, pairs of speech data from the source and target speakers uttering the same sentences. However, the use of parallel data in training causes several problems: 1) the data used for the training are limited to the pre-defined sentences, 2) the trained model is only applied to the speaker pair used in the training, and 3) mismatches in alignment may occur. Although it is thus preferable in VC not to use parallel data, a nonparallel approach is considered difficult to learn. In our approach, we achieve nonparallel training based on a speaker adaptation technique and capturing latent phonological information. This approach assumes that speech signals are produced from a restricted Boltzmann machine-based probabilistic model, where phonological information and speaker-related information are defined explicitly. Speaker-independent and speaker-dependent parameters are simultaneously trained under speaker adaptive training. In the conversion stage, a given speech signal is decomposed into phonological and speaker-related information, the speaker-related information is replaced with that of the desired speaker, and then voice-converted speech is obtained by mixing the two. Our experimental results showed that our approach outperformed another nonparallel approach, and produced results similar to those of the popular conventional Gaussian mixture model-based method that used parallel data, in both subjective and objective criteria.

    Keywords: voice conversion, Restricted Boltzmann machine, unsupervised training, speaker adaptation

    Updated 2025-09-19 17:13:59

  • Improved sound quality by using the exciter speaker in OLED panel

    Abstract: The technology of using a vibrating OLED panel directly as a speaker was first commercialized in 2017 by LG Display. This technology attaches a vibration actuator called an exciter to the back of the OLED panel, so that the panel operates as the diaphragm of the speaker. Display technology has developed from CRT to PDP, LCD, and LED, and in that process image quality and appearance have led the way. According to previous studies, image and sound must be improved at the same time for easy information transmission. In addition, viewer concentration improves when the positions of the image and the sound match. The advantage of this technology is that it provides the viewer with direct sound coming from the panel, delivering clear audio when the sound and video appear in the same position. This gives the viewer a high intensity and maximizes the effect of information transfer. In this paper, we introduce research on generating sound directly from the panel using the exciter speaker. The research also improves the sound quality of exciter speakers. Furthermore, it makes the left and right sounds independent by using the exciter speakers.

    Keywords: flat-panel speaker, sound enhancement, OLED panel speaker, exciter speaker

    Updated 2025-09-16 10:30:52

  • Towards the Realization of Graphene Based Flexible Radio Frequency Receiver

    Abstract: We report on our progress in developing high-speed flexible graphene field effect transistors (GFETs) with high electron and hole mobilities (~3000 cm²/V·s) and an intrinsic transit frequency in the microwave GHz regime. We also describe the design and fabrication of a flexible graphene-based radio frequency system. This RF communication system consists of a graphite patch antenna at 2.4 GHz, a graphene-based frequency translation block (frequency doubler and AM demodulator), and a graphene speaker. The communication blocks are used to demonstrate a graphene-based amplitude modulated (AM) radio receiver operating at 2.4 GHz.

    Keywords: AM radio receiver, demodulators, speaker, transistor, antenna, graphene

    Updated 2025-09-04 15:30:14