Controlling the iSEE interface using speech recognition
Kevin Rue-Albrecht
MRC WIMM Centre for Computational Biology, University of Oxford, Oxford, OX3 9DS, UK
kevinrue67@gmail.com
Federico Marini
Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), Mainz
Center for Thrombosis and Hemostasis (CTH), Mainz
marinif@uni-mainz.de
Charlotte Soneson
Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
SIB Swiss Institute of Bioinformatics
charlottesoneson@gmail.com
Aaron Lun
infinite.monkeys.with.keyboards@gmail.com
7 October 2024
Source: vignettes/voice.Rmd
Compiled date: 2024-10-07
Last edited: 2018-11-29
License: MIT + file LICENSE
Feature
Using JavaScript, iSEE applications can leverage lightweight speech recognition libraries that react to specific vocal commands (think “OK Google”, “Hey Siri”) and trigger updates of the UI equivalent to one or more mouse or keyboard interactions with the UI components (Rue-Albrecht et al. 2018).
Note: As we value privacy, this feature is disabled by default: iSEE(..., voice=FALSE).
To keep the spoken commands reasonably short, only one panel may be under voice command at any one time. All spoken commands will affect the currently active panel, until a new panel is selected for voice command. See section Vocal commands available.
Implementation
We use the annyang lightweight JavaScript library to handle speech recognition and update Shiny reactive values in the same way as mouse and keyboard UI elements trigger panel updates.
Note that annyang requires an active internet connection, as it relies on the browser’s own speech recognition engine (see the annyang FAQ). For instance, in Google Chrome, this engine performs the recognition in the cloud.
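The mechanism described above can be illustrated with a small standalone Shiny app. This is a minimal sketch, not the code used inside iSEE: the input name `voice_colour_by`, the spoken pattern, and the CDN address for annyang are all assumptions made for illustration. The JavaScript registers one annyang command and forwards the recognised words to Shiny with `Shiny.setInputValue()`, so that the server reacts exactly as it would to a text box or drop-down menu.

```r
library(shiny)

# JavaScript sent to the browser. It registers a single annyang command and
# forwards the recognised words to Shiny as the input `voice_colour_by`
# (a hypothetical name, not one used by iSEE).
voice_js <- "
if (window.annyang) {
  annyang.addCommands({
    'colour by *name': function(name) {
      Shiny.setInputValue('voice_colour_by', name, {priority: 'event'});
    }
  });
  annyang.start();
}
"

ui <- fluidPage(
    tags$head(
        # Assumption: annyang is loaded from this CDN address
        tags$script(src="https://cdnjs.cloudflare.com/ajax/libs/annyang/2.6.1/annyang.min.js")
    ),
    tags$script(HTML(voice_js)),
    textOutput("colour_by")
)

server <- function(input, output, session) {
    # The spoken value arrives like any other Shiny input
    output$colour_by <- renderText({
        req(input$voice_colour_by)
        paste("Colour by:", input$voice_colour_by)
    })
}

demo_app <- shinyApp(ui, server)
```

Running `shiny::runApp(demo_app)` in a compatible browser and saying "colour by" followed by a name should update the displayed text, provided that microphone access is granted.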
Supported web browsers
Note that the speech recognition library that we use does not work with every web browser. So far, we have only validated this feature in Google Chrome. Please refer to the annyang FAQ for details.
Usage
Using the sce object that we generated earlier, enabling speech recognition is as simple as setting voice=TRUE when creating the app.
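A minimal sketch of this call is given here. The toy `sce` object is a hypothetical stand-in, built only so that the snippet runs on its own; in practice, use the object generated earlier.

```r
library(SingleCellExperiment)
library(iSEE)

# Toy stand-in for the `sce` object (hypothetical data, for illustration only)
set.seed(42)
counts <- matrix(
    rpois(200, lambda=5), nrow=20,
    dimnames=list(paste0("gene", 1:20), paste0("cell", 1:10))
)
sce <- SingleCellExperiment(assays=list(counts=counts))
reducedDim(sce, "PCA") <- matrix(
    rnorm(20), ncol=2,
    dimnames=list(colnames(sce), c("PC1", "PC2"))
)

# voice=TRUE enables speech recognition in the resulting app
app <- iSEE(sce, voice=TRUE)
```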
With voice=TRUE, the lightweight JavaScript speech recognition library annyang is loaded and activated in any web browser tab that runs app.
If your default browser is not compatible with the feature, or if you work in RStudio, you can prevent the application from opening in the default browser by setting launch.browser=FALSE as follows:
if (interactive()) {
    shiny::runApp(app, port=1234, launch.browser=FALSE)
}
At that point, your R console should be displaying the address and port where app is running. In the example above, that would be:
Listening on http://127.0.0.1:1234
Using a compatible browser, navigate to the indicated address and port. Note that when the web page opens, you may be prompted to allow the web browser to use your microphone, which you must accept to enable the functionality.
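As an alternative to navigating manually, `launch.browser` also accepts a function that is called with the application's URL. The sketch below uses this to open the app directly in Chrome. The helper `open_in_chrome()` is hypothetical, and the executable name `"google-chrome"` is an assumption that will differ between operating systems; `app` is assumed to be the object created with voice=TRUE.

```r
# Hypothetical helper: open a URL in Chrome rather than the default browser.
# Assumption: "google-chrome" is the name of the executable on this system.
open_in_chrome <- function(url, browser="google-chrome") {
    utils::browseURL(url, browser=browser)
}

if (interactive()) {
    # `app` is assumed to be the iSEE app created with voice=TRUE
    shiny::runApp(app, port=1234, launch.browser=open_in_chrome)
}
```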
Vocal commands available
As a proof of concept, only a subset of spoken commands is currently implemented, compared to the full range of interactions possible using the mouse and keyboard.
Note that in the commands below, words in brackets are optional.
- “Show active panel”: shows a persistent notification displaying the name of the panel currently under vocal control.
- “Create <panel type>”: Adds a new panel of the requested type to the GUI and immediately takes vocal control of it.
- “Remove <Reduced dimension plot 1>”: Removes the requested panel from the GUI. If the panel was under vocal control, clears vocal control.
- “Control <Reduced dimension plot 1>”: Takes vocal control of the requested panel.
- “Colour using <Column data | Feature name | …>”: Changes the colouring mode of the panel under vocal control.
- “Colour by <…>”: Changes the colouring covariate (e.g. gene name, colData column name) of the panel under vocal control.
- “Receive selection from <Reduced dimension plot 1>”: Makes the panel under vocal control receive the point selection from the requested panel.
- “Send selection to <Reduced dimension plot 1>”: Makes the requested panel receive the point selection from the panel under vocal control.
- “Good <boy | girl>!”: If the app is behaving well, throw it a bone!
Session Info
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] iSEE_2.17.4 SingleCellExperiment_1.27.2
#> [3] SummarizedExperiment_1.35.3 Biobase_2.65.1
#> [5] GenomicRanges_1.57.1 GenomeInfoDb_1.41.2
#> [7] IRanges_2.39.2 S4Vectors_0.43.2
#> [9] BiocGenerics_0.51.3 MatrixGenerics_1.17.0
#> [11] matrixStats_1.4.1 BiocStyle_2.33.1
#>
#> loaded via a namespace (and not attached):
#> [1] rlang_1.1.4 magrittr_2.0.3 shinydashboard_0.7.2
#> [4] clue_0.3-65 GetoptLong_1.0.5 compiler_4.4.1
#> [7] mgcv_1.9-1 png_0.1-8 systemfonts_1.1.0
#> [10] vctrs_0.6.5 pkgconfig_2.0.3 shape_1.4.6.1
#> [13] crayon_1.5.3 fastmap_1.2.0 XVector_0.45.0
#> [16] fontawesome_0.5.2 utf8_1.2.4 promises_1.3.0
#> [19] rmarkdown_2.28 UCSC.utils_1.1.0 shinyAce_0.4.2
#> [22] ragg_1.3.3 xfun_0.48 zlibbioc_1.51.1
#> [25] cachem_1.1.0 jsonlite_1.8.9 listviewer_4.0.0
#> [28] later_1.3.2 DelayedArray_0.31.14 parallel_4.4.1
#> [31] cluster_2.1.6 R6_2.5.1 bslib_0.8.0
#> [34] RColorBrewer_1.1-3 jquerylib_0.1.4 Rcpp_1.0.13
#> [37] bookdown_0.40 iterators_1.0.14 knitr_1.48
#> [40] httpuv_1.6.15 Matrix_1.7-0 splines_4.4.1
#> [43] igraph_2.0.3 tidyselect_1.2.1 abind_1.4-8
#> [46] yaml_2.3.10 doParallel_1.0.17 codetools_0.2-20
#> [49] miniUI_0.1.1.1 lattice_0.22-6 tibble_3.2.1
#> [52] shiny_1.9.1 evaluate_1.0.0 desc_1.4.3
#> [55] circlize_0.4.16 pillar_1.9.0 BiocManager_1.30.25
#> [58] DT_0.33 foreach_1.5.2 shinyjs_2.1.0
#> [61] generics_0.1.3 ggplot2_3.5.1 munsell_0.5.1
#> [64] scales_1.3.0 xtable_1.8-4 glue_1.8.0
#> [67] tools_4.4.1 colourpicker_1.3.0 fs_1.6.4
#> [70] grid_4.4.1 colorspace_2.1-1 nlme_3.1-166
#> [73] GenomeInfoDbData_1.2.13 vipor_0.4.7 cli_3.6.3
#> [76] textshaping_0.4.0 fansi_1.0.6 viridisLite_0.4.2
#> [79] S4Arrays_1.5.10 ComplexHeatmap_2.21.1 dplyr_1.1.4
#> [82] gtable_0.3.5 rintrojs_0.3.4 sass_0.4.9
#> [85] digest_0.6.37 SparseArray_1.5.43 ggrepel_0.9.6
#> [88] rjson_0.2.23 htmlwidgets_1.6.4 memoise_2.0.1
#> [91] htmltools_0.5.8.1 pkgdown_2.1.1 lifecycle_1.0.4
#> [94] shinyWidgets_0.8.7 httr_1.4.7 GlobalOptions_0.1.2
#> [97] mime_0.12
# devtools::session_info()