Samuel L. Smith
2020 – today
- 2024
- [c13] Antonio Orvieto, Soham De, Caglar Gulcehre, Razvan Pascanu, Samuel L. Smith:
Universality of Linear Recurrences Followed by Non-linear Projections: Finite-Width Guarantees and Benefits of Complex Eigenvalues. ICML 2024
- [i25] Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George-Cristian Muraru, Albert Gu, Ruba Haroun, Leonard Berrada, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, Arnaud Doucet, David Budden, Yee Whye Teh, Razvan Pascanu, Nando de Freitas, Caglar Gulcehre:
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models. CoRR abs/2402.19427 (2024)
- [i24] Aleksandar Botev, Soham De, Samuel L. Smith, Anushan Fernando, George-Cristian Muraru, Ruba Haroun, Leonard Berrada, Razvan Pascanu, Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Sertan Girgin, Olivier Bachem, Alek Andreev, Kathleen Kenealy, Thomas Mesnard, Cassidy Hardin, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Armand Joulin, Noah Fiedel, Evan Senter, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, David Budden, Arnaud Doucet, Sharad Vikram, Adam Paszke, Trevor Gale, Sebastian Borgeaud, Charlie Chen, Andy Brock, Antonia Paterson, Jenny Brennan, Meg Risdal, Raj Gundluru, Nesh Devanathan, Paul Mooney, Nilay Chauhan, Phil Culliton, Luiz Gustavo Martins, Elisa Bandy, David Huntsperger, Glenn Cameron, Arthur Zucker, Tris Warkentin, Ludovic Peran, Minh Giang, Zoubin Ghahramani, Clément Farabet, Koray Kavukcuoglu, Demis Hassabis, Raia Hadsell, Yee Whye Teh, Nando de Freitas:
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models. CoRR abs/2404.07839 (2024)
- 2023
- [c12] Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L. Smith, Yee Whye Teh:
Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation. ICLR 2023
- [c11] Antonio Orvieto, Samuel L. Smith, Albert Gu, Anushan Fernando, Çaglar Gülçehre, Razvan Pascanu, Soham De:
Resurrecting Recurrent Neural Networks for Long Sequences. ICML 2023: 26670-26698
- [i23] Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L. Smith, Yee Whye Teh:
Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation. CoRR abs/2302.10322 (2023)
- [i22] Sahra Ghalebikesabi, Leonard Berrada, Sven Gowal, Ira Ktena, Robert Stanforth, Jamie Hayes, Soham De, Samuel L. Smith, Olivia Wiles, Borja Balle:
Differentially Private Diffusion Models Generate Useful Synthetic Images. CoRR abs/2302.13861 (2023)
- [i21] Antonio Orvieto, Samuel L. Smith, Albert Gu, Anushan Fernando, Çaglar Gülçehre, Razvan Pascanu, Soham De:
Resurrecting Recurrent Neural Networks for Long Sequences. CoRR abs/2303.06349 (2023)
- [i20] Antonio Orvieto, Soham De, Çaglar Gülçehre, Razvan Pascanu, Samuel L. Smith:
On the Universality of Linear Recurrences Followed by Nonlinear Projections. CoRR abs/2307.11888 (2023)
- [i19] Leonard Berrada, Soham De, Judy Hanwen Shen, Jamie Hayes, Robert Stanforth, David Stutz, Pushmeet Kohli, Samuel L. Smith, Borja Balle:
Unlocking Accuracy and Fairness in Differentially Private Image Classification. CoRR abs/2308.10888 (2023)
- [i18] Samuel L. Smith, Andrew Brock, Leonard Berrada, Soham De:
ConvNets Match Vision Transformers at Scale. CoRR abs/2310.16764 (2023)
- 2022
- [i17] Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle:
Unlocking High-Accuracy Differentially Private Image Classification through Scale. CoRR abs/2204.13650 (2022)
- 2021
- [c10] Andrew Brock, Soham De, Samuel L. Smith:
Characterizing signal propagation to close the performance gap in unnormalized ResNets. ICLR 2021
- [c9] Samuel L. Smith, Benoit Dherin, David G. T. Barrett, Soham De:
On the Origin of Implicit Regularization in Stochastic Gradient Descent. ICLR 2021
- [c8] Andy Brock, Soham De, Samuel L. Smith, Karen Simonyan:
High-Performance Large-Scale Image Recognition Without Normalization. ICML 2021: 1059-1071
- [i16] Andrew Brock, Soham De, Samuel L. Smith:
Characterizing signal propagation to close the performance gap in unnormalized ResNets. CoRR abs/2101.08692 (2021)
- [i15] Samuel L. Smith, Benoit Dherin, David G. T. Barrett, Soham De:
On the Origin of Implicit Regularization in Stochastic Gradient Descent. CoRR abs/2101.12176 (2021)
- [i14] Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan:
High-Performance Large-Scale Image Recognition Without Normalization. CoRR abs/2102.06171 (2021)
- [i13] Stanislav Fort, Andrew Brock, Razvan Pascanu, Soham De, Samuel L. Smith:
Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error. CoRR abs/2105.13343 (2021)
- [i12] Tudor Berariu, Wojciech Czarnecki, Soham De, Jörg Bornschein, Samuel L. Smith, Razvan Pascanu, Claudia Clopath:
A study on the plasticity of neural networks. CoRR abs/2106.00042 (2021)
- 2020
- [c7] Samuel L. Smith, Erich Elsen, Soham De:
On the Generalization Benefit of Noise in Stochastic Gradient Descent. ICML 2020: 9058-9067
- [c6] Soham De, Samuel L. Smith:
Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks. NeurIPS 2020
- [i11] Soham De, Samuel L. Smith:
Batch Normalization Biases Deep Residual Networks Towards Shallow Paths. CoRR abs/2002.10444 (2020)
- [i10] Samuel L. Smith, Erich Elsen, Soham De:
On the Generalization Benefit of Noise in Stochastic Gradient Descent. CoRR abs/2006.15081 (2020)
- [i9] Ben Adlam, Jasper Snoek, Samuel L. Smith:
Cold Posteriors and Aleatoric Uncertainty. CoRR abs/2008.00029 (2020)
- [i8] Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel L. Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko:
BYOL works even without batch statistics. CoRR abs/2010.10241 (2020)
2010 – 2019
- 2019
- [c5] Daniel S. Park, Jascha Sohl-Dickstein, Quoc V. Le, Samuel L. Smith:
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study. ICML 2019: 5042-5051
- [i7] Daniel S. Park, Jascha Sohl-Dickstein, Quoc V. Le, Samuel L. Smith:
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study. CoRR abs/1905.03776 (2019)
- 2018
- [c4] Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le:
Don't Decay the Learning Rate, Increase the Batch Size. ICLR (Poster) 2018
- [c3] Samuel L. Smith, Quoc V. Le:
A Bayesian Perspective on Generalization and Stochastic Gradient Descent. ICLR (Poster) 2018
- [c2] Vitalii Zhelezniak, Dan Busbridge, April Shen, Samuel L. Smith, Nils Y. Hammerla:
Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks. ICLR (Workshop) 2018
- [i6] Vitalii Zhelezniak, Dan Busbridge, April Shen, Samuel L. Smith, Nils Y. Hammerla:
Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks. CoRR abs/1805.03435 (2018)
- [i5] Samuel L. Smith, Daniel Duckworth, Quoc V. Le, Jascha Sohl-Dickstein:
Stochastic natural gradient descent draws posterior samples in function space. CoRR abs/1806.09597 (2018)
- 2017
- [c1] Samuel L. Smith, David H. P. Turban, Steven Hamblin, Nils Y. Hammerla:
Offline bilingual word vectors, orthogonal transformations and the inverted softmax. ICLR (Poster) 2017
- [i4] Samuel L. Smith, David H. P. Turban, Steven Hamblin, Nils Y. Hammerla:
Offline bilingual word vectors, orthogonal transformations and the inverted softmax. CoRR abs/1702.03859 (2017)
- [i3] Samuel L. Smith, Quoc V. Le:
A Bayesian Perspective on Generalization and Stochastic Gradient Descent. CoRR abs/1710.06451 (2017)
- [i2] Samuel L. Smith, Pieter-Jan Kindermans, Quoc V. Le:
Don't Decay the Learning Rate, Increase the Batch Size. CoRR abs/1711.00489 (2017)
- 2016
- [i1] Samuel L. Smith:
Monte Carlo Sort for unreliable human comparisons. CoRR abs/1612.08555 (2016)
last updated on 2024-10-07 21:12 CEST by the dblp team
all metadata released as open data under CC0 1.0 license