[–]mdooder 41 points 1 year ago
Hello Prof. Bengio, What motivates you to stay in academia? What do you think about corporate research labs in terms of productivity and innovation when compared to academic labs? Does research flexibility (doing what you want, more or less) play a large role in this decision?
[–]yoshua_bengio Prof. Bengio 29 points 1 year ago
I like academia because I can choose what to work on, I can choose to work on long-term goals, I can work for the benefit of humanity rather than for a specific company, and I can talk about my work freely. Note that to different degrees, my esteemed colleagues in large industrial labs also enjoy some of that freedom.
[–]alecradford 30 points 1 year ago*
Hi there! I'm an undergrad and your work combined with Hinton's is a huge inspiration to me! A bunch of questions, so feel free to answer all or none!
Hinton semi-recently offered an awesome MOOC on Coursera over NNs. The resources and lectures it provided are what allowed me and many others to build homebrew nets and really get into the field. It would be a great resource if another researcher at the forefront of the field offered their own take, do you have any plans for something like this?
As a leading professor in the field, how do you personally view the resurgence of interest in modern NN applications? Do you believe it's well deserved recognition, guilty of overhype, some mixture of the two, or something completely different! On a similar note, how do you feel about the portrayal of modern NN research in popular literature?
I'm interested in using unsupervised techniques to learn automated data augmentations/corruptions for increasing generalization performance, which I hope is a promising hybrid of supervised and unsupervised learning that's different from traditional pretraining. A lot of advances have been made using "simple" data augmentations/corruptions pioneered in your lab like gaussian noise corruption and what we now call input dropout in the context of DAEs. Preliminary results on MNIST seem successful (~0.8% permutation invariant) and I can send code if you are interested but admittedly I'm just an undergrad with no formal research experience. Do you see this as an area with potential and could you point me to any resources or papers that you are aware of - I've had a hard time finding them.
No one has a crystal ball, but what do you see as the most interesting areas of research for continuing to advance your work? The last few years has seen purely supervised techniques make a lot of headroom riding off the success of dropout, for instance.
Thank you so much for doing this AMA, it's great to have you here on /r/MachineLearning!
[–]yoshua_bengio Prof. Bengio 22 points 12 months ago
I have no clear plan for a MOOC but I might do one eventually. In the meantime, I am writing a new and more complete book on deep learning (with Ian Goodfellow and Aaron Courville). Some draft chapters should come out in the next few months and feedback from the community and students would be great. Note that Hugo Larochelle (formerly a PhD with me and a post-doc with Hinton) has great videos on deep learning http://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH (and slides on his web page).
I believe that the recent surge of interest in NNets just means that the machine learning community wasted many years not exploring them, in the 1996-2006 decade, mostly. There is also hype, especially if you consider the media. That is unfortunate and dangerous, and will be exploited especially by companies trying to make a quick buck. The danger is to see another bust when wild promises are not followed by outstanding results. Science mostly moves by small steps and we should stay humble.
I have no crystal ball but I believe that improving our ability to model joint distributions (either in an unsupervised way or conditioned on some input, either explicitly or implicitly through learning of good representations) is going to be crucial for future progress of deep learning towards AI-level machine understanding of the world around us.
Another easy prediction is that we need to and will make progress towards efficiently training much larger models. This involves improvements in the way we train model (the numerical optimization involved), as well as in ways to do it computationally more efficiently (e.g. through parallelization and other tricks that avoid doing the computation associated with all the parts of the network for every example).
You can find out more in my arxiv paper on "looking forward": http://arxiv.org/abs/1305.0445
[–]Sigmoid_Freud 14 points 1 year ago
Traditional (deep or non-deep) Neural Networks seem somewhat limited in the sense that they cannot keep any contextual information. Each datapoint/example is viewed in isolation. Recurrent Neural Networks overcome this, but they seem to be very hard to train and have been tried in a variety of designs with apparently relatively limited success.
Do you think RNNs will become more prevalent in the future? For which applications and using what designs?
Thank you very much for taking your time to do this!
[–]yoshua_bengio Prof. Bengio 16 points 1 year ago
Recurrent or recursive nets are really useful tools for modelling all kinds of dependency structures on variable-sized objects. We have made progress on ways to train them and it is one of the important areas of current research in the deep learning community. Examples of applications: speech recognition (especially the language part), machine translation, sentiment analysis, speech synthesis, handwriting synthesis and recognition, etc.
[–]omphalos 2 points 1 year ago
I'd be curious to hear his thoughts on any intersection between liquid state machines (one approach to this problem) and deep learning.
[–]yoshua_bengio Prof. Bengio 11 points 1 year ago*
Liquid state machines and echo state networks do not learn the recurrent weights, i.e., they do not learn the representation. Instead, learning good representations is the central purpose of deep learning. In a way, the echo-state / liquid state machines are like SVMs, in the sense that we put a linear predictor on top of a fixed set of features. The features are functions of the past sequence through the smartly initialized recurrent weights, in the case of echo state networks and liquid state machines. Those features are good, but they can be even better if you learn them!
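To make that contrast concrete, here is a minimal numpy sketch of the echo-state recipe (a toy next-step prediction task; all sizes, scalings and data are illustrative assumptions, not code from the thread): the recurrent weights are drawn at random, rescaled, and then left fixed, and only a linear readout on top of the reservoir states is trained.

```python
import numpy as np

rng = np.random.RandomState(0)

# Toy task: one-step-ahead prediction of a noisy sine wave.
T = 500
series = np.sin(0.2 * np.arange(T + 1)) + 0.05 * rng.randn(T + 1)
inputs, targets = series[:-1], series[1:]

# Reservoir: random recurrent weights, rescaled to spectral radius < 1
# (the "echo state" property) and then left completely untrained.
n_hidden = 200
W_in = rng.uniform(-0.5, 0.5, size=n_hidden)
W = rng.uniform(-0.5, 0.5, size=(n_hidden, n_hidden))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()

# Collect the reservoir states: features that depend on the whole past
# of the sequence, but are never learned.
states = np.zeros((T, n_hidden))
h = np.zeros(n_hidden)
for t in range(T):
    h = np.tanh(W_in * inputs[t] + W.dot(h))
    states[t] = h

# Only the linear readout is trained (ridge regression), as in ESNs/LSMs.
ridge = 1e-6
A = states.T.dot(states) + ridge * np.eye(n_hidden)
w_out = np.linalg.solve(A, states.T.dot(targets))
mse = np.mean((states.dot(w_out) - targets) ** 2)
print("readout mean squared error: %.5f" % mse)
```

Learning W itself, rather than fixing it, is exactly the part that the recurrent deep learning approaches discussed in this thread add on top of this baseline.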
[–]yoshua_bengio Prof. Bengio 6 points 12 months ago
See the answer I already gave there: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpboj8
[–]Noncomment 3 points 12 months ago
Did you mean recursion?
[–]omphalos 2 points 12 months ago
Thank you for the reply. Yes I understand the analogy to SVMs. Honestly I was wondering about something more along the lines of using the liquid state machine's untrained "chaotic" states (which encode temporal information) as feature vectors that a deep network can sit on top of, and thereby construct representations of temporal patterns.
[–]rpascanu 3 points 12 months ago
I would add that ESNs or LSMs can provide insights into why certain things don't work or work for RNNs. So having a good grasp of them could definitely be useful for deep learning. An example is Ilya's work on initialization (jmlr.org/proceedings/papers/v28/sutskever13.pdf), where they show that an initialization based on the one proposed by Herbert Jaeger for ESNs is very useful for RNNs as well.
They also offer quite a strong baseline most of the time.
[–]freieschaf 2 points 1 year ago
Take a look at Schmidhuber's page on RNNs. There is quite a lot of info on them, and especially on LSTMNN, an architecture of RNN designed precisely for tackling the issue of vanishing gradient when training RNNs and so allowing them to keep track of a longer context.
[–]PasswordIsntHAMSTER 13 points 1 year ago
Hi Prof. Bengio, I'm an undergrad at McGill University doing research in type theory. Thank you for doing this AMA!
Questions:
My field is extremely concerned with formal proofs. Is there a significant focus on proofs in machine learning too? If not, how do you make sure to maintain scientific rigor?
Is there research being done about the use of deep learning for program generation? My intuition is that eventually we could use type theory to specify a program and deep learning to "search " for an instantiation of the specification, but I feel like we're quite far from that.
Can you give me examples of exotic data structure used in ML?
How would I get into deep learning starting from zero? I don't know what resources to look at, though if I develop some rudiments I would LOVE to apply for a research position on your team.
[–]yoshua_bengio Prof. Bengio 10 points 12 months ago
There is a simple way that you get scientific rigor without proof, and it's used throughout science: it's called the scientific method, and it relies on experiments and hypothesis-testing ;-) Besides, math is getting into more deep learning papers. I have been interested for some time in proving properties of deep vs shallow architectures (see papers with Delalleau, and more recently with Pascanu). With Nicolas Le Roux I worked on the approximation properties of RBMs and DBNs. I encourage you to also look at the papers by Montufar. Fancy math there.
Deep learning from 0? There is lots of material out there, some listed on deeplearning.net:
My 2009 paper/book (a new one is on the way!): http://www.iro.umontreal.ca/~bengioy/papers/ftml_book.pdf
Hugo Larochelle's neural networks course & youtube videos: http://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH (slides on his webpage)
Practical recommendations for training deep nets: http://www.google.com/url?q=http%3A%2F%2Farxiv.org%2Fabs%2F1206.5533&sa=D&sntz=1&usg=AFQjCNFJClbJs-wyBb46aPwER1ZfOB_kng
A recent review: https://arxiv.org/abs/1206.5538
[–]PokerPirate 2 points 1 year ago
On a related note, I am doing research in probabilistic programming languages. Do you think there will ever be a "deep learning programming language" (whatever that means) that makes it easier for nonexperts to write deep learning models?
[–]ian_goodfellow[S] 5 points 12 months ago
I am one of Yoshua's graduate students and our lab develops a python package called Pylearn2 that makes it relatively easy for non-experts to do deep learning:
https://github.com/lisa-lab/pylearn2
You'll still need to have some idea of what the algorithms are meant to be doing, but at least you won't have to implement them yourself.
[–]nxvd 5 points 1 year ago
It's not a programming language in the usual sense, but Theano is a pretty neat way to describe and train neural network architectures, however deep they are and whatever their characteristics. It's actually developed by people in Dr. Bengio's lab if I'm not mistaken.
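For readers who have not seen the Theano style being referred to, here is a minimal sketch (the toy data, sizes and learning rate are made-up assumptions): you declare symbolic variables, let Theano derive the gradients, and compile a training function.

```python
import numpy as np
import theano
import theano.tensor as T

rng = np.random.RandomState(0)

# Toy binary classification data.
X_data = rng.randn(200, 3).astype(theano.config.floatX)
y_data = (X_data[:, 0] + X_data[:, 1] > 0).astype(theano.config.floatX)

# Symbolic variables: this only *describes* the computation.
X = T.matrix("X")
y = T.vector("y")
w = theano.shared(np.zeros(3, dtype=theano.config.floatX), name="w")
b = theano.shared(np.asarray(0.0, dtype=theano.config.floatX), name="b")

p = T.nnet.sigmoid(T.dot(X, w) + b)                     # predicted probability
loss = -T.mean(y * T.log(p) + (1 - y) * T.log(1 - p))   # cross-entropy

# Theano derives the gradients symbolically...
gw, gb = T.grad(loss, [w, b])

# ...and compiles a function that performs one gradient-descent step.
train = theano.function(inputs=[X, y], outputs=loss,
                        updates=[(w, w - 0.1 * gw), (b, b - 0.1 * gb)])

for epoch in range(100):
    cost = train(X_data, y_data)
print("final training loss:", cost)
```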
[–]serge_cell 2 points 1 year ago
IMHO definitely should be. There are several open source packages with similar functionality right now, and different research papers refer to different packages for results reproduction. Would be great if one wouldn't have to install and learn new package to reproduce result, but just use ready made cfg or script in dl language. Would improve reproducibility too - results reproduced with different implementation are more relatable.
[–]PokerPirate 1 point 1 year ago
links?
[–]serge_cell 3 points 1 year ago
I'm mostly familiar with convolutional networks, so most of the packages here are for CNNs and autoencoders
Fastest:
1. cuda-convnet - most used gpgpu implementation, used in other packages too
https://code.google.com/p/cuda-convnet/ there are also several forks on github
2. caffe
https://github.com/BVLC/caffe
3. NNforge
http://milakov.github.io/nnForge/
Based on cuda-convnet, but includes more stuff:
4. pylearn2
https://github.com/lisa-lab/pylearn2
other stuff:
http://deeplearning.net/software_links/
[–]polyguo 2 points 1 year ago
What probabilistic programming languages are you researching? Any experience with Church? I have an internship this summer with someone who does research using PPLs and it would be immensely useful to me if you could point me to resources that would allow me to get more familiar with the subject matter. Papers and actual code would be best.
[–]PokerPirate 1 point 1 year ago
Have you been to http://probmods.org? It's a pretty thorough tutorial.
[–]polyguo 2 points 1 year ago
I'm actually taking the probabilistic graphical models course on Coursera and I got a copy of Koller's book. I'm familiar with the theory, I've yet to see mature code written in PPLs.
And, yes, I've been to the site. I'm actually going to be working with one of the authors.
[–]PokerPirate 1 point 1 year ago
me too :)
[–]dwf 1 point 1 year ago
Machine learning is a big field. The folks who submit to COLT would be big on proofs. Others, not as much. Empirical study counts for a lot.
[–]orwells1 1 point 12 months ago
Can't see a reply so this might help:
Ilya Sutskever https://vimeo.com/77050653 2013, 1:05:13
[–]wardnath 15 points 1 year ago*
Dr. Bengio, In your paper Big Neural Networks Waste Capacity you suggest that gradient descent does not work as well with a lot of neurons as it does with fewer. (1) Why do the increased interactions create worse local minima? (2) Do you think hessian free methods like in (Martens 2010) are sufficient to overcome these issues?
Thank You!
Ref: Dauphin, Yann N., and Yoshua Bengio. "Big neural networks waste capacity." arXiv preprint arXiv:1301.3583 (2013).
Martens, James. "Deep learning via Hessian-free optimization." Proceedings of the 27th International Conference on Machine Learning (ICML-10). 2010.
[–]dhammack 9 points 1 year ago
I think the answer to this one is that the increased interactions just lead to more curvature (off diagonal Hessian terms). Gradient descent, as a first-order technique, ignores curvature (it assumes the Hessian is the identity matrix). So what happens is that gradient descent is less effective in bigger nets because you tend to "bounce around" minima.
[–]yoshua_bengio Prof. Bengio 9 points 1 year ago
This is essentially in agreement with my understanding of the issue. It's not clear that we are talking about local minima, but what I call 'effective local minima', because training gets stuck (they could also be saddle points or other kinds of flat regions). We also know that 2nd order methods don't do miracles, in many cases, so something else is going on that we do not understand yet.
[–]ian_goodfellow[S] 10 points 1 year ago
Verification post: https://plus.google.com/103174629363045094445/posts/2fqbkyYULAf
[–]hf98hf43j2klhf9 7 points 1 year ago
We should try to request Yann LeCun as well, he seems to be open to the idea.
[–]Megatron_McLargeHuge 11 points 1 year ago
With the recent success of maxout and hinge activations, how relevant is the older work on RBM pretraining using various contrastive divergence tweaks? What do you think is still worth investigating about stochastic models?
How biologically plausible is maxout, and should we care?
[–]yoshua_bengio Prof. Bengio 4 points 12 months ago*
The older work on RBM and auto-encoders is certainly still worth further investigation, along with the construction of other novel unsupervised learning procedures.
For one thing, unsupervised procedures (and pre-training) remain a key ingredient to deal with the semi-supervised and transfer learning cases (and domain adaptation, and non-stationary data), when the number of labeled examples of the new classes (or of the changed distribution) is small. This is how we won the two 2011 transfer learning competitions (held at ICML and NIPS).
Furthermore, looking farther into the future, unsupervised learning is very appealing for other reasons:
take advantage of huge quantities of unlabeled data
learn about the statistical dependencies between all the variables observed so that you can answer NEW questions (not seen during training) about any subset of variables given any other subset
it's a very powerful regularizer and can help the learner to disentangle the underlying factors of variation, making much easier to solve new tasks from very few examples
it can be used in the supervised case when the output variable (to be predicted) is a very high-dimensional composite object (like an image or a sentence), i.e., a so-called structured output
Maxout and other such pooling units do something that may be related to the local competition (often through inhibitory interneurons) between neighboring neurons in the same area of cortex.
[–]ian_goodfellow[S] 3 points 12 months ago
Right now pretraining does seem to be helpful for preventing overfitting in cases where there is very little labeled training data available. It no longer seems to be necessary as an optimization technique for deep networks, since we can just use the piecewise linear activation functions that are easy to optimize even for very deep networks.
Probabilistic models are still useful for tasks like classification with missing input (because they can reason about the missing inputs), or tasks where the goal is to repair damaged inputs (example: photo touchup) or infer the values of missing inputs, or where the task is just to generate realistic samples of data. It can also often be useful to have a probabilistic model that you use as part of a larger system. For example, if you want to use a neural net as part of an HMM, the HMM requires that its observation and transition models provide real probabilities.
Rectified linear units were partially motivated by biological plausibility concerns, because some neuroscientific evidence suggests that real neurons rarely operate in the regime where they reach their maximum firing rate.
I'm the grad student who came up with maxout, and I didn't have any biological plausibility concerns in mind when I came up with it. After I started using maxout for machine learning, another of Yoshua's grad students, Caglar Gulcehre, told me that there is some neuroscientific evidence for a function similar to maxout but with an absolute value being used in the deeper layers of the cortex. I don't know much about this myself. One thing about maxout that makes it a little bit difficult to explain in biological terms is the fact that maxout units can take on negative values. This is a bit awkward for a biological neurons since it's not possible to have a negative firing rate. But maybe biological neurons could use some average firing rate to indicate 0, and indicate negative values by firing less often than that.
My main interest is in engineering intelligent systems, not necessarily understanding how the human brain works. Because that's what my interest is, I am not very concerned with biological plausibility. Right now it seems easier to make progress in machine learning just by working from first principles than by reverse-engineering the brain. We don't have good enough sensor equipment to extract the kind of information from the brain that we would need to make reverse engineering it convenient.
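For concreteness, here is a minimal numpy sketch of the maxout unit described above (the sizes and the group structure are illustrative assumptions): each output takes the maximum over a small group of learned linear responses, so it is piecewise linear and, unlike a rectifier, can take negative values.

```python
import numpy as np

def maxout_layer(x, W, b):
    """Maxout hidden layer.

    x: (batch, n_in) inputs
    W: (n_in, n_out, k) weights, k linear pieces per output unit
    b: (n_out, k) biases
    Each output unit returns the max over its k linear responses, so the
    layer is piecewise linear and its outputs can be negative.
    """
    z = np.einsum('bi,iok->bok', x, W) + b   # (batch, n_out, k) linear pieces
    return z.max(axis=2)                     # (batch, n_out)

rng = np.random.RandomState(0)
x = rng.randn(4, 10)                 # a batch of 4 examples with 10 features
W = 0.1 * rng.randn(10, 5, 3)        # 5 maxout units, 3 linear pieces each
b = np.zeros((5, 3))
print(maxout_layer(x, W, b).shape)   # -> (4, 5)
```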
[–]jkyle1234 13 points 1 year ago*
Hello Prof. Bengio, thank you for the AMA. What recommendations would you have for someone who is not a PhD in getting started with Deep Learning?
[–]yoshua_bengio Prof. Bengio 4 points 12 months ago
See some of the pointers I put above: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7a3s
[–]32er234 1 point 12 months ago
Something wrong with the link
[–]uber_kerbonaut 1 point 12 months ago
maybe he's referring to this one: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpn5yp
[–][deleted] 12 points 1 year ago
Dear Yoshua, thanks for doing this!
You are, to my knowledge, the only ML academic to publicly (and wonderfully!) speculate about the sociocultural perspectives afforded by the vantage of deep representation learning. In your fascinating article "Culture vs Local Minima" you touch on many important things, some of which I'm very curious about:
You describe how individuals learn by being immersed in culture. We both agree that they don't always learn very wholesome things. If you were king of the world, and you could prescribe a set of concepts that should be a part of every childhood learning trajectory, what would those be and to what end?
A corollary of "cultural immersion" is that the specific process of learning is not evident to the learner, the world simply "is" in a particular way. The author David Foster Wallace phrased this phenomenon as akin to fish having to figure out what water is. In your opinion, is this phenomenon an experiential byproduct of the neural architecture, or does it confer some learning benefit?
Why do you think that cultural trends become entrenched and cause their learners to fight to stay in (what could be argued to be) local optima - like e.g. the conflicts between various religious institutions and Enlightenment philosophy, or patriarchal society vs the suffragettes, etc.? Is this a case of very pernicious parameters, or is there some benefit to the learners in question?
Do you have an opinion on such concepts as mindfulness meditation, and if so, how do you think they relate to the exploration of "idea space"?
Again, thanks a lot for taking the time. In the space of human ideas you are a trailblazer, and we are immensely richer for your presence!
[–]yoshua_bengio Prof. Bengio 9 points 1 year ago
I am not a social scientist or a psychologist, so my opinions on these subjects should be taken as such. My opinion is that many learners stay entrenched in their beliefs because these beliefs have become part of their identity, their definition of who they are, and it's harder and scary to change that. There may also be a more computational aspect related to the notion of effective local minima (the optimization getting stuck). I believe that a lot of what our brain does is try to bring coherence to all of our experience, in order to construct a better model of the world. Mathematically, this may be related to the problem of inference, by which a learner searches for plausible explanations (latent variables) of the observed data. In stochastic models, inference is done by a form of stochastic exploration of configurations (and a Markov chain really looks like a series of free associations). Meditation and other time spent not doing anything directed but just thinking may well be useful to help us explore in this way. Sometimes it clicks, i.e., we find an explanation that fits well with many things. This is also how scientific ideas often seem to emerge (for me at least).
[–]yoshua_bengio Prof. Bengio 10 points 1 year ago
Verification post: https://plus.google.com/112504130537129706790/posts/eqdBAysAyqR
[–]vondragon 8 points 1 year ago
I live in Montreal, working in the technology startup world. Very interested in your work, thank you for doing this AMA Professor Bengio. I worked hard to filter down to one question:
There seems to be a lot of disinterest from Machine Learning specialists and academics in general towards ML competitions hosted by Kaggle and the like. I recognize the odds of winning are quite low, making the return on the investment of your time even worse, but it would seem to be even worse for ML enthusiasts that are flocking to participate. It would seem a few hours from an ML domain expert could be really beneficial on the right open datasets. Can you imagine an open, collaborative approach to competitive machine learning where experts and enthusiasts work effectively together?
[–]EJBorey 10 points 1 year ago
Here's an example where experts won a Kaggle contest: http://blog.kaggle.com/2012/11/01/deep-learning-how-i-did-it-merck-1st-place-interview/ And here, where they won the Netflix Prize: http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html
But I think the reason why they don't work on the problems is that the bad ML researchers won't win and therefore not publish, while the good ones would get paid millions of dollars by companies to answer the same questions! Why do it for free?
[–]vondragon 6 points 1 year ago
I would estimate that a majority of the time ML 'experts' do win the competitions, but they might not be recognized experts.
When a "non-expert" does win, they typically make up for their lack of domain-specific ML knowledge by being an expert in a related domain like stats, math, programming, etc.
I think the dataset is an important factor to consider here. Is it possible for an ML researcher to spend an insignificant amount of their time to apply some of their knowledge building the model, at which point a larger crowd of less specialized people can compete on the remaining work?
[–]PasswordIsntHAMSTER 2 points 1 year ago
I'm in Montreal too, where do you work? o.O
[–]vondragon 1 point 1 year ago
Near Sherbrooke =D
[–]dwf 2 points 12 months ago
ML researchers are usually trying to push the methodological envelope, but that's often not required to solve some arbitrary domain problem. Usually dealing with the mountain of annoyances of real-world data sources is what takes up the majority of the time, and then a random forest, boosted tree ensemble or SVM will do an acceptable job (especially compared to the usually pitiful posted baseline). Doing really, really well may require some finesse but also a large time investment, that won't typically be rewarded in an academic incentive structure (as far as being rewarded monetarily, there's also something seriously wrong with the economics of Kaggle, as is well-articulated by this lightning talk; anyone who's any good and has a clue what they're worth won't bother).
In short, winning competitions is usually only useful to an academic if it demonstrates a particular research-related point.
[–]marvinalone 9 points 1 year ago
What's your opinion of Solomonoff Induction and AIXI? I'm just starting to read up on the topic, and I can't quite decide whether it's serious work, or a fringe theory by a small group of people who all cite each other.
[–]dylanbyte 2 points 1 year ago
I am interested in this also.
[–]eaturbrainz 2 points 1 year ago
Not Bengio, but reasonably well-versed in this specific topic.
It's serious work by theoreticians. You need a freaking Turing oracle to make those algorithms work, and all the relevant proofs are about global optimality in the presence of that Turing oracle, not about how good a learning/error rate you're going to get out of a finite sample with limited computing power (as you're going to need to build real algorithms).
That said, Schmidhuber and Hutter (who invented AIXI) have publication and competition records like nobody fucking else.
[–]dwf 2 points 12 months ago
I'll just say that while the IDSIA group's competition record and benchmark results are impressive, it's important to compare apples to apples. Comparing a method that uses elastic distortions and other dataset augmentation strategies against a method that doesn't, doesn't tell you anything about either method; it's been known for decades that more data helps, and that you can sometimes acquire more data by artificially augmenting a given training set with distortions. It's important to not conflate impressive engineering with scientific novelty.
[–]EJBorey 9 points 1 year ago
We have all been hearing about the performance achievable via deep learning (in academic journals such as the New York Times, no less!). I've also heard that it's difficult for non-experts to get these techniques to work: Ilya Sutskever says that there is a weighty oral tradition about the design and training of deep networks and that the best way to learn how is to work for years with someone who is already an expert (source: http://vimeo.com/77050653).
I studied machine learning but not deep learning. Going back to grad school is not really an option for me. How can I learn how to design, build, and train deep neural networks without access to the oral tradition? Could you write it down for us somewhere?
[–]yoshua_bengio Prof. Bengio 3 points 12 months ago
See the pointers I put above: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq6wf0
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7a3s
[–]EJBorey 1 point 12 months ago
The second link is broken.
Do Hugo Larochelle's videos answer the questions here: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq4rvi ?
[–]dylanbyte 2 points 1 year ago
Related to this: would it be possible to use a Bayesian approach to try and encode some of this folk-lore knowledge?
What is the road-map to making deep learning accessible to all?
Thank you.
[–]yoshua_bengio Prof. Bengio 8 points 12 months ago
Hyper-parameter optimization has already been found to be a useful way to (partially) automate the search for good configurations in deep learning.
The idea is to automate the process of selecting the knobs, bells and whistles of machine learning algorithms, and especially of deep learning algorithms. We call such "knobs" hyper-parameters. They are different from the parameters that are learned during training, in that they are typically set by hand, by trial and error, or through a dumb and extensive exploration of all combinations of values (called "grid search"). Deep learning and neural networks in general involve many more such knobs to be tuned, and that was one of the reasons why many practitioners stayed far from neural networks in the past. It gave the impression of deep learning as a "black art", and it remains true that strong expertise helps a lot, but the research on hyper-parameter optimization is helping to move towards a more fully automated deep learning.
The idea of optimizing hyper-parameters is old, but had not had as much visible success until recently. One of the main early contributors to this line of work (before it was applied to machine learning hyper-parameter optimization) is Frank Hutter (along with collaborators), who devoted his PhD thesis (2009) to algorithms for optimizing knobs that are typically set by hand in general in software systems. My former PhD student James Bergstra and I worked on hyper-parameter optimization a couple of years ago and we first proposed a very simple alternative, called "random sampling" to standard methods (called "grid search"), which works very well and is very easy to implement.
http://jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
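The point of the random-sampling alternative can be shown in a few lines (a toy validation-error surface and made-up hyper-parameter ranges, not code from the paper): with the same evaluation budget, random search tries many more distinct values of the hyper-parameter that actually matters than a grid does.

```python
import numpy as np

rng = np.random.RandomState(0)

# Pretend validation error as a function of two hyper-parameters,
# log10(learning rate) and log10(weight decay). Only the learning rate
# matters much, which is the typical situation that favours random search.
def validation_error(log_lr, log_decay):
    return (log_lr + 2.5) ** 2 + 0.01 * (log_decay + 4.0) ** 2

budget = 16

# Grid search: a 4 x 4 grid tries only 4 distinct learning-rate values.
grid = [(lr, dec) for lr in np.linspace(-5, -1, 4)
                  for dec in np.linspace(-6, -2, 4)]
best_grid = min(validation_error(lr, dec) for lr, dec in grid)

# Random search: the same budget gives 16 distinct values on every axis.
draws = [(rng.uniform(-5, -1), rng.uniform(-6, -2)) for _ in range(budget)]
best_rand = min(validation_error(lr, dec) for lr, dec in draws)

print("best error found by grid search  : %.4f" % best_grid)
print("best error found by random search: %.4f" % best_rand)
```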
We then proposed using for deep learning the kinds of algorithms Hutter had developed for other contexts, called sequential optimization and this was published at NIPS'2011, in collaboration with another PhD student who devoted his thesis to this work, Remi Bardenet, and his supervisor Balazs Kegl (previously a prof in my lab, now in France).
http://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
This work has been followed up very successfully by researchers at U. Toronto, including Jasper Snoek (then a student of Geoff Hinton), Hugo Larochelle (who did his PhD with me) and Ryan Adams (now a faculty at Harvard) with a paper at NIPS'2012 where they showed that they could push the state-of-the-art on the ImageNet competition, helping to improve the same neural net that made Krizhevsky, Sutskever and Hinton famous for breaking records in object recognition.
http://www.dmi.usherb.ca/~larocheh/publications/gpopt_nips.pdf
Snoek et al put out a software that has since been used by many researchers, called 'spearmint', and I found out recently that Netflix has been using it in their new work aiming to take advantage of deep learning for movie recommendations:
http://techblog.netflix.com/2014/02/distributed-neural-networks-with-gpus.html
[–]james_bergstra 1 point 12 months ago*
Plug for Bayesian Optimization and Hyperopt:
FWIW my take is that Bayesian Optimization + Experts designing the search spaces for SMBO algorithms is the way to deal with this: e.g. other post and ICML paper on tuning ConvNets
The Hyperopt Python package provides SMBO for ConvNets, NNets, and (soon) a range of classifiers from scikit-learn (hyperopt-sklearn).
Sign up for Hyperopt-announce to get alerts about new stuff such as upcoming Gaussian-Process and regression-tree-based SMBO search algorithms similar to Jasper Snoek's Spearmint and Frank Hutter's SMAC software.
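A minimal sketch of what driving Hyperopt looks like (the objective below is a stand-in for a real training run and the hyper-parameter names are made up; fmin, tpe, hp and Trials are the library's actual entry points):

```python
import numpy as np
from hyperopt import fmin, tpe, hp, Trials

# Stand-in for "train a model with these hyper-parameters and return the
# validation error"; a real objective would launch a training run.
def objective(params):
    lr, n_hidden = params['lr'], params['n_hidden']
    return (np.log10(lr) + 2.5) ** 2 + 0.001 * n_hidden

space = {
    'lr': hp.loguniform('lr', np.log(1e-5), np.log(1e-1)),
    'n_hidden': hp.choice('n_hidden', [64, 128, 256, 512]),
}

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)
print("best hyper-parameters found:", best)
```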
[–]EJBorey 2 points 1 year ago
Actually, I wasn't asking about the Bayesian optimization work that Jasper Snoek et al. are doing, because I don't think it will be possible to automate away all human judgement in the design of these things. Rather, I wanted to know how to quickly acquire the necessary intuition without postdoc-ing in Bengio, Hinton, or LeCun's labs.
Deep learning will never be practical if there's only 10 people on the planet who can get it to work! Is there a way to quickly become one of the savants?
[–]orwells1 1 point 12 months ago*
Hello, same here. I fit the bill of their intended phd students (according to Y. Lecun's page, awesome math + coder), but wanted to avoid more phd/post-docs. I went through a reasonable number of papers, but in most there are either explanations missing or later the authors comment online on the "human in the loop optimization"/"tricks of the trade"/"black magic". I'm not sure if I should be investing much more of my time alone, if the full knowledge is not there. Is it? Thanks a lot for doing this!
[–]serge_cell 9 points 1 year ago
Hi Prof. Bengio, There has been some work on applying "higher" math - algebraic/tropical geometry, category theory - to deep learning. Notably, John Healy several years ago claimed to have improved a neural net (ART1) with category theory. What's your opinion on this approach? Will it remain only a toy model in the foreseeable future, or is there some promise in this approach, in your opinion?
[–]yoshua_bengio Prof. Bengio 4 points 12 months ago
See the above suggestions (http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7a3s). Regarding algebraic/tropical geometry, look at the work of Morton & Montufar.
[–]polyguo 2 points 1 year ago
Source? I'm extremely interested in the intersection between Programming Language Theory and Machine Learning. This seems to be right there.
[–]serge_cell 2 points 1 year ago
Healy:
http://www.ece.unm.edu/~mjhealy/
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.98.6807
Tropical geometry
Tropical geometry of statistical models
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.242.9890
[–]n_dimensional 8 points 1 year ago
Dear Prof. Bengio,
I am about to finish my PhD in computational neuroscience and I am very interested in the "gray area" between neuroscience and machine learning.
What aspects of brain computation do you think are (or will be) most relevant for machine learning?
If you could know the answer to one question about how the brain computes information, what would that be?
Thanks!
[–]yoshua_bengio Prof. Bengio 6 points 12 months ago
Understanding how learning proceeds in brains is clearly the subject most relevant to machine learning. We don't have a clue of how brains can learn in the kinds of efficient ways that we are able to implement in artificial neural networks, so this could be really important, and a place where information could flow both ways between machine learning research and computational neuroscience.
[–]exellentpossum 21 points 1 year ago
When asked about sum product networks, one of the original Google Brain team members told me he's not interested in tractable models.
What's your opinion about sum product networks? They made a big splash at NIPS one year and now they've disappeared.
[–]yoshua_bengio Prof. Bengio 6 points 1 year ago
There are many kinds of intractabilities that show up in different places with various learning algorithms. The more tractable the easier to deal with in general, but it should not be at the price of losing crucial expressive power. I don't have a sufficiently clear mental fix on the expressive power of SPNs to know how much we lose (if any) through this parametrization of a joint distribution. In any case, all the interesting models that I know of suffer from intractability of minimizing the training criterion wrt the parameters (i.e. training is fundamentally hard, at least in theory). SVMs and other related kernel machines do not suffer from that problem, but they may suffer from poor generalization unless you provide them with the right feature space (which is precisely what is hard, and what deep learning is trying to do).
[–]celestec 3 points 1 year ago
Hi exellentpossum, I am studying some machine learning on my own, and have not yet come across "tractable models." What exactly is a tractable model? (Searching on my own didn't help much...) Sorry if this is a dumb question.
[–]exellentpossum 3 points 1 year ago
In the context of sum product networks, it means that inference is tractable or doesn't suffer from the exponential growth in computational cost when you add more variables.
This comes at a price though, sum product networks can only represent certain types of distributions. More specifically, probability distributions where its parameterization can be expressed as a product of factors (when multiplied out this creates a much larger polynomial). I'm not sure of the exact scope of distributions this encompasses, but it does include hierarchical mixture models.
[–]Scrofuloid 3 points 1 year ago*
Not quite. All graphical models can be represented as products of factors, and deep belief networks and such are special cases of graphical models. Inference in graphical models is usually considered intractable in the treewidth of the graph. So, in conventional graphical model wisdom, low-treewidth graphical models were considered 'tractable', and high-treewidth models were 'intractable', so you'd have to use MCMC or BP or other approximate algorithms to solve them.
Any graphical model can be compiled into an SPN-like structure (an arithmetic circuit, or AC). The problem is that in the worst-case, the resulting circuit can be exponentially large. So even though inference is still linear in the size of the circuit, it's potentially exponential in the size of the original graphical model. But it turns out certain high-treewidth graphical models can still be compiled into compact circuits, so you can still do efficient inference on them. This means that there are certain high-treewidth graphical models on which inference is tractable -- kind of a surprise to the graphical models community.
You can think of ACs and SPNs as a way to compactly represent context-specific independences. They can compactly represent distributions that would result in high-treewidth graphical models if you tried to represent them in the usual graphical models way. The difference between ACs and SPNs is that ACs are compiled from Bayesian networks, as a means of performing inference on them. SPNs directly use the circuit to represent a probability distribution. So instead of training a graphical model and hoping you can compile it into a compact circuit (AC), you directly learn a compact circuit that fits your training data (SPN).
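A tiny hand-built example may help make "inference is linear in the size of the circuit" concrete (this is an illustration with made-up numbers, not taken from the SPN papers): a sum node mixing two product nodes over two binary variables, where a marginal query is answered in a single bottom-up pass by letting marginalized leaves return 1.

```python
# A tiny sum-product network over two binary variables X1, X2:
#   root = 0.6 * (leaf_a1 * leaf_a2) + 0.4 * (leaf_b1 * leaf_b2)
# Product nodes have disjoint scopes and the sum node mixes children with
# the same scope, so the circuit defines a valid distribution.

def leaf(p, x):
    # Bernoulli leaf; x=None means "marginalize this variable out" -> 1.
    if x is None:
        return 1.0
    return p if x == 1 else 1.0 - p

def spn(x1, x2):
    prod_a = leaf(0.9, x1) * leaf(0.2, x2)
    prod_b = leaf(0.1, x1) * leaf(0.7, x2)
    return 0.6 * prod_a + 0.4 * prod_b

# The joint sums to 1 over all assignments.
print(sum(spn(a, b) for a in (0, 1) for b in (0, 1)))  # 1.0

# Marginal P(X1 = 1) in a single bottom-up pass, no explicit sum over X2.
print(spn(1, None))  # 0.6 * 0.9 + 0.4 * 0.1 = 0.58
```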
[–]exellentpossum 1 point 1 year ago
I agree, SPNs can represent any probability distribution. But there is a certain set which can be represented efficiently. Can you be more specific about this set of distributions which can take advantage of the factorization property of SPNs (a distribution with a reasonably sized circuit)?
[–]Scrofuloid 1 point 1 year ago
Hm. I don't know if there's a one-line way to characterize that set of distributions. It includes all low-treewidth graphical models, and some high-treewidth distributions with context-specific independences. Poon & Domingos' paper had a section relating SPNs to various other representations.
[–][deleted] 1 year ago
[deleted]
[–]BeatLeJuce 7 points 1 year ago
Why do Deep Networks actually work better than shallow ones? We know a 1-Hidden-Layer Net is already a Universal Approximator (for better or worse), yet adding additional fully connected layers usually helps performance. Were there any theoretical or empirical investigations into this? Most papers I read just showed that they WERE better, but there were very few explanations as to why -- and if there was any explanation, then it was mostly speculation. What is your view on the matter?
What was your most interesting idea that you never managed to publish?
What was funniest/weirdest/strangest paper you ever had to peer-review?
If I read your homepage correctly, you teach your classes in French rather than English. Is this a personal preference or mandated by your University (or by other circumstances)?
[–]yoshua_bengio Prof. Bengio 6 points 12 months ago
Being a universal approximator does not tell you how many hidden units you will need. For arbitrary functions, depth does not buy you anything. However, if your function has structure that can be expressed as a composition, then depth could help you save big, both in a statistical sense (less parameters can express a function that has a lot of variations, and so need less examples to be learned) and in a computational sense (less parameters = less computation, basically).
I teach in French because U. Montreal is a French-language university. However, three quarters of my graduate students are non-francophones, so it is not a big hurdle.
[–]rpascanu 1 point 12 months ago
Regarding 1, there is some work in this direction. You can check out these papers:
http://arxiv.org/abs/1312.6098 (about rectifier deep MLPs),
http://arxiv.org/abs/1402.1869 (about deep MLPs with piecewise-linear activations),
RBM_Representational_Efficiency.pdf,
http://arxiv.org/abs/1303.7461.
Basically the universal approximator theorem says that a one-layer MLP can approximate any function if you allow yourself an infinite number of hidden units, which in practice one cannot do. One advantage of deep models over shallow ones is that they can be (exponentially) more efficient at representing certain families of functions (arguably the families of functions we actually care about).
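One way to get a feel for the "number of linear regions" quantity studied in the papers above is to count, empirically, how many distinct ReLU on/off patterns a network realizes on random inputs. The sketch below does this for a shallow and a deep net with roughly matched parameter counts; the sizes and sampling procedure are arbitrary choices, and counts at random initialization should not be read as a proof of the theoretical separation.

```python
import numpy as np

rng = np.random.RandomState(0)

def count_activation_patterns(layer_sizes, n_points=20000, n_in=2):
    """Count distinct ReLU on/off patterns reached on random 2-D inputs.

    Each distinct pattern corresponds to one linear region of the
    piecewise-linear function the network computes.
    """
    x = rng.uniform(-1, 1, size=(n_points, n_in))
    h, patterns, fan_in = x, [], n_in
    for n_out in layer_sizes:
        W = rng.randn(fan_in, n_out) / np.sqrt(fan_in)
        b = 0.1 * rng.randn(n_out)
        a = h.dot(W) + b
        patterns.append(a > 0)     # which units are active
        h = np.maximum(a, 0)       # ReLU
        fan_in = n_out
    codes = np.concatenate(patterns, axis=1)
    return len(set(map(tuple, codes.tolist())))

# Two nets with roughly the same number of parameters (~100 weights+biases).
print("shallow, 1 hidden layer of 34 units:", count_activation_patterns([34]))
print("deep, 3 hidden layers of 6 units   :", count_activation_patterns([6, 6, 6]))
```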
[–]shanwhiz 6 points 1 year ago
We have seen deep learning work really well for image/video/sound. Do you foresee it working for text classification as well? Most papers that have tried text/document classification using deep learning have not done better than the conventional SVM/Bayes. What are your thoughts on this?
[–]yoshua_bengio Prof. Bengio 9 points 1 year ago
I predict that deep learning will have a big impact in natural language processing. It has already had an impact, in part due to an old idea of mine (from NIPS'2000 and a 2003 paper in JMLR): represent words by a learned vector of attributes, learned so as to model the probability distribution of sequences of words in natural language text. The current challenge is to learn distributed representations for sequences of words, phrases and sentences. Look at the work of Richard Socher, which is pretty impressive. Look at the work of Tomas Mikolov, who beat the state of the art in language models using recurrent networks and who found that these distributed representations magically capture some form of analogical relationships between words. For example, if you take the representation for Italy minus the representation for Rome, plus the representation for Paris, you get something close to the representation for France: Italy - Rome + Paris = France. Similarly, you get that King - Man + Woman = Queen, and so on. Since the model was not trained explicitly to do these things, this is really amazing.
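The analogy arithmetic described here boils down to vector addition followed by a nearest-neighbour search under cosine similarity. The sketch below uses tiny made-up vectors just to show the mechanics; with real learned embeddings (such as Mikolov's), the same procedure recovers "queen" and "France".

```python
import numpy as np

# Tiny made-up "embeddings"; real ones (e.g. word2vec) are learned from text.
emb = {
    'king':  np.array([0.9, 0.8, 0.1, 0.0]),
    'queen': np.array([0.9, 0.1, 0.8, 0.0]),
    'man':   np.array([0.1, 0.9, 0.1, 0.1]),
    'woman': np.array([0.1, 0.1, 0.9, 0.1]),
    'paris': np.array([0.0, 0.2, 0.2, 0.9]),
}

def cosine(u, v):
    return u.dot(v) / (np.linalg.norm(u) * np.linalg.norm(v))

def analogy(a, b, c):
    """Word closest (by cosine) to emb[a] - emb[b] + emb[c]."""
    target = emb[a] - emb[b] + emb[c]
    # Exclude the three query words themselves, as is standard for this test.
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(emb[w], target))

print(analogy('king', 'man', 'woman'))  # -> 'queen' with these toy vectors
```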
[–]hapagolucky 10 points 1 year ago
I see more and more pop media articles extolling deep learning as a panacea that will make AI a reality (Wired is especially guilty of this). Given the AI winters of the 1970's and 1980's that arose from overhyped expectations, what can deep learning and ML researchers and advocates do to mitigate this from happening again?
[–]yoshua_bengioProf. Bengio 4 指標 12 月 前
Stick to the scientific ways of demonstrating advances (something that is often lacking from companies branding themselves as doing deep learning). Avoid overselling. Stay humble, while not losing the motivation associated with the long-term vision that brought us here in the first place.
[–][deleted] 7 指標 1 年 前
Hi Bengio. I'm a masters candidate in robotics, mostly doing reinforcement learning mushed together with some ML regression methods for the identification of interesting value functions and state space representations.
How is your work life balance? Do you have fun? What sorts of things do you do to unwind?
I'm considering doing a PhD, but I literally feel like just getting a part-time job and doing independent research, because the academic environment can be pretty stifling.
Also, Montreal seems really fun!
J
[–]yoshua_bengioProf. Bengio 16 指標 1 年 前
Life balance. That is tough. Many prominent scientists will tell you the same story. My inclination is to work as much as I can: that is probably part of the reason for my early success, but it may threaten my health and personal life. We live in an environment which puts so much pressure on us that it is easy to forget that we are humans, and that we need breaks and to take care of our bodies (I have some health issues that I cannot just ignore) and our relationships with other humans. Some kind of self-discipline helps, but I found that what works best is to cultivate what is rewarding and pleasurable and at the same time good for me and my physical and emotional well-being. For example, I like very much to walk (many ideas come!), not to mention eating healthily and enjoying a romantic relationship based on authenticity, where I can really be myself.
Oh, and yes, Montreal IS fun ;-)
The advantage of academia is that you can focus on research and that you can benefit enormously from the interactions with other researchers. Research is a collective enterprise. This is NOT like what you tend to see in science-fiction movies. Never forget that!
[–][deleted] 1 指標 12 月 前
This is really refreshing to hear!
I have been struggling with balance as well. I think I should find my balanced way of being a scientist as well, and find a supervisor who wants to be my long term colleague and friend - not just a pedantic sort of guide and disciplinary figure. Perhaps giving up on academia is the easy way out. Perhaps what I really need to do is make more inspirational friends, and help join and build the community I want to be a part of.
Thanks so much for the candid response! It's very eye-opening. I hope you keep being awesome and inspiring people like me! (but not so much that we keep losing so much sleep over our work :p)
[–]Derpscientist 6 指標 1 年 前*
Dr Bengio,
I'd like to thank you for the amazing research and software (Theano, Pylearn2) that your lab has contributed.
What are your feelings on Hinton and LeCun moving to industry?
What about academia and publishing your research is more valuable than the floating point overflow of money you could make at private companies?
Are you nervous that machine learning will go the way of time-series analysis, where a lot of advanced research takes place behind closed doors because the intellectual property is so valuable?
Given the recent advancements in training discriminative neural networks, what role do you envision generative neural networks playing in the future?
[–]yoshua_bengioProf. Bengio 8 指標 12 月 前
I think that with Hinton & LeCun in industry, there will be more rapid advances in applying deep learning to really interesting and large-scale problems. The downside may be a temporary reduction in the supply of supervision for new graduate students in deep learning. However, there are many young faculty who are at the forefront of deep learning research and who are eager to take on strong new students. And the fact that deep learning is being used heavily in industry means that more students get to know about the field and are excited to jump into it.
Personally, I prefer the freedom of academia over more zeros in my salary. See also what I wrote above: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpbc1g
I believe that a lot of research will continue to happen in academia and that in the large industrial labs the incentive to publish will remain high.
I think that generative networks are very important for the future. See what I wrote above about unsupervised learning (the two are not synonymous, but often come together, especially since we found the generative interpretation of auto-encoders; see the work with Guillaume Alain, http://arxiv.org/pdf/1305.6663.pdf):
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq7v4v
[–]quaternion 1 指標 1 年 前
Could you provide additional info on who and what you are referring to with time series analysis?
[–]tryolabs_feco 9 指標 1 年 前*
Hi Yoshua, very excited about this AMA, thank you for your time. I have a few questions:
- What are the biggest challenges in ML nowadays?
- What are the most interesting and/or creative ways you have seen people/businesses using ML?
- What does the future of Machine Learning look like?
[–]freieschaf 4 指標 1 年 前
Last year I did my undergrad thesis on NLP using probabilistic models and neural networks partly inspired by your work. I became interested and at that point I considered doing further work on NLP. Currently I am pursuing an MSc degree taking several related courses.
But, after several months, I haven't found NLP to be as motivating as I was expecting it to be; research in this area seems to be a little stagnant, from my limited point of view. What do you think are some challenges that are making, or going to make, this field move forward?
Thanks for taking the time to answer some questions here!
[–]yoshua_bengioProf. Bengio 8 指標 1 年 前
I believe that the really interesting challenge in NLP, which will be the key to actual "natural language understanding", is the design of learning algorithms that will be able to learn to represent meaning. For example, I am working on ways to model sequences of words (language modeling) or to translate a sentence in one language into a corresponding one in another language. In both of these cases we are trying to learn a representation of the meaning of a phrase or sentence (not just of a single word). In the case of translation, you can think of it like an auto-encoder: the encoder (specialized to French) maps a French sentence into its meaning representation (represented in a universal way), while a decoder (specialized to English) maps this to a probability distribution over English sentences that have the same meaning (i.e., you can sample a plausible translation). With the same kind of tool you can obviously paraphrase, and with a bit of extra work, you can do question answering and other standard NLP tasks.

We are not there yet, and the main challenges I see have to do with numerical optimization (it is difficult not to underfit neural networks when they are trained on huge quantities of data). There are also more computational challenges: we need to be able to train much larger models (say 10000x bigger), and we can't afford to wait 10000x more time for training. Parallelizing is not simple, but it should help.

All this will of course not be enough to get really good natural language understanding. To do this well would basically allow a system to pass some Turing test, and it would require the computer to understand a lot about how our world works. For this we will need to train such models with more than just text. The meaning representation for sequences of words can be combined with the meaning representation for images or video (or other modalities, but image and text seem the most important for humans). Again, you can think of the problem as translating from one modality to another, or as asking whether two representations are compatible (one expresses a subset of what the other expresses). In a simpler form, this is already how Google image search works. And traditional information retrieval also fits the same structure (replace "image" by "document").
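To make the encoder/decoder picture a bit more concrete, here is a deliberately minimal numpy sketch of the idea, with random (untrained) parameters and made-up sizes; a real system would learn all of these weights and use a much better architecture. An encoder RNN reads the source sentence into a single "meaning" vector, and one decoder step turns that vector (plus the previous target word) into a distribution over target words:

    import numpy as np

    rng = np.random.RandomState(0)
    d_emb, d_hid, V_src, V_tgt = 8, 16, 50, 60           # toy sizes (assumptions)

    # Randomly initialized, i.e. untrained, parameters.
    E_src = rng.randn(V_src, d_emb) * 0.1                 # source word embeddings
    U_enc = rng.randn(d_emb, d_hid) * 0.1
    W_enc = rng.randn(d_hid, d_hid) * 0.1
    E_tgt = rng.randn(V_tgt, d_emb) * 0.1                 # target word embeddings
    U_dec = rng.randn(d_emb, d_hid) * 0.1
    W_dec = rng.randn(d_hid, d_hid) * 0.1
    W_out = rng.randn(d_hid, V_tgt) * 0.1

    def encode(src_ids):
        # Read the source sentence word by word; the final state is a fixed-length
        # "meaning" representation of the whole sentence.
        h = np.zeros(d_hid)
        for i in src_ids:
            h = np.tanh(h @ W_enc + E_src[i] @ U_enc)
        return h

    def decode_step(h, prev_tgt_id):
        # One decoder step: update the state and output a softmax distribution
        # over the target vocabulary.
        h = np.tanh(h @ W_dec + E_tgt[prev_tgt_id] @ U_dec)
        logits = h @ W_out
        p = np.exp(logits - logits.max())
        return h, p / p.sum()

    meaning = encode([3, 17, 42])        # a toy "French" sentence given as word ids
    h, p = decode_step(meaning, 0)       # 0 stands for a beginning-of-sentence token
    print(p.shape, round(p.sum(), 3))    # (60,) 1.0 -- a distribution over "English" words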
[–]akshayxyz 1 指標 12 月 前
I am not from academia, but ever since I have started following machine learning stuff, I keep getting interesting ideas/problems to solve. Here is one I got few years back.
You take simple math word problems, e.g. simple ratio/proportion, rate/motion, age, give/take, etc. word problems; they can (have to) be translated into a bunch of constants, unknown(s) and math relations/concepts, eventually to find some unknown(s). And everyone who understands the concepts will come up with similar equations, and definitely one correct answer. You can view it as an NLP problem. How to solve it? Well, I don't know; maybe by first trying to extract basic concepts/relations from standard (and simple) word problems?
Thinking aloud - you may start by doing something like "part of (math) speech" tagging... or get some labeled data (problem -> math equation), and see if you can find some hidden factors/relations defining the translations...
[–]deeperredder 4 指標 1 年 前*
While deep nets have helped move the state of the art forward in natural language text understanding, the improvements there haven't really been significant. Where do you think significant progress can come from in that field?
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
I do think that significant progress will come in the area of natural language processing, most importantly, natural language understanding. Progressively, though (because full understanding is essentially AI-level understanding of the world around us). See my previous answer:
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpje92
[–]CyberByte 11 指標 1 年 前
What will be the role of deep neural nets in Artificial General Intelligence (AGI) / Strong AI?
Do you believe AGI can be achieved (solely) by further developing these networks? If so: how? If not: why not, and are they still suitable for part of the problem (e.g. perception)?
Thanks for doing this AMA!
[–]davidscottkrueger 6 指標 12 月 前
Hi! My name's David Krueger; I'm a Master's student in Bengio's lab (LISA).
My response is: it is not clear what their role will be. AGI may be theoretically achievable solely by developing NNs (especially if we include RNNs), but this is not how it will actually take place.
What incompetentrobot said is literally false, but there is a kernel of truth, which is that Deep Learning (so far) just provides a set of methods for solving certain well-defined types of general Machine Learning problems (such as function approximation, density estimation, sampling from complex distributions, etc.).
So the point is that the contributions of the Deep Learning community haven't been about solving fundamentally new kinds of problems, but rather finding better ways to solve fundamental problems.
[–]willis77 8 指標 1 年 前
Have you observed practical applications where deep learning succeeds but traditional ML fails? i.e. not simply improving the state of the art on an image benchmark by X%, but a case where an intractable problem is made tractable, solely via deep learning?
[–]yoshua_bengioProf. Bengio 9 指標 12 月 前
There is a constructed task on which all the traditional black-box machine learning methods that were tried failed, and where some deep learning variants work reasonably well (and where guiding the hidden representation completely nails the task, showing the importance of looking for algorithms that can discover good intermediate representations that disentangle the underlying factors). Note that many deep learning approaches also failed, so this is interesting. See http://arxiv.org/abs/1301.4083. What's particular about this task is that it is the composition of two much easier tasks (detecting objects, performing a logical operation on the result), i.e., it intrinsically requires more depth than a simple object recognition task.
[–]SnowLong 2 指標 1 年 前*
I believe no one had a commercially deployed system that could search untagged images until deep convolutional nets hugely improved the state of the art on the ImageNet benchmark. It took less than half a year for Google to implement search in personal galleries after promising results were shown. So in a way traditional methods failed - none were good enough to actually put into production...
[–]Should_I_say_this 7 指標 1 年 前
Can you describe what you are currently researching, first by bringing us up to speed on the current techniques used and then what you are trying to do to advance that?
[–]SnowLong 8 指標 1 年 前
I think your question was answered by Yoshua here:
Deep Learning of Representations: Looking Forward
Yoshua Bengio
arXiv:1305.0445v2 [cs.LG] 7 Jun 2013
[–]Should_I_say_this 1 指標 1 年 前
This is excellent thanks!
[–]dwf 4 指標 12 月 前
Following on work Ian and I did on maxout, I recently did some work empirically interrogating how and why dropout works, focusing on the rectified linear case. More recently I've been working on hyperparameter optimization.
[–]exellentpossum 3 指標 1 年 前
It would be cool if members from Bengio's group could also answer this (like Ian).
[–]rpascanu 7 指標 12 月 前
I've done some work lately on the theory side (showing that deep models can be more efficient than shallow ones):
http://arxiv.org/abs/1402.1869
http://arxiv.org/abs/1312.6098
I've been spending quite a bit of time on natural gradient, and I'm currently exploring variants of the algorithm, and I'm interested in how it addresses non-convex optimization specific problems.
And, of course, recurrent networks which have been the focus of my PhD since I started. Particularly I worked on understanding the difficulties of training them (http://arxiv.org/abs/1211.5063) and how depth can be added to RNNs (http://arxiv.org/abs/1312.6026).
[–]caglargulcehre 5 指標 12 月 前
Hi, my name is Caglar Gulcehre and I am a PhD student at the LISA lab. You can access my academic page here: http://www-etud.iro.umontreal.ca/~gulcehrc/.
I have done some work related to Yoshua Bengio's "Culture and Local Minima" paper; basically we focused on empirically validating the optimization difficulty of learning high-level abstract problems: http://arxiv.org/abs/1301.4083
Recently I've started working on recurrent neural networks, and we have a joint work with Razvan Pascanu, Kyung Hyun Cho and Yoshua Bengio: http://arxiv.org/abs/1312.6026
I've also worked on a new kind of activation function which we claim to be more efficient at representing complicated functions than regular activation functions (e.g. sigmoid, tanh, etc.):
http://arxiv.org/abs/1311.1780
Nowadays I am working on statistical machine translation and learning & generating sequences using RNNs and whatnot. But I am still interested in the optimization difficulty of learning high-level (or abstract) tasks.
[–]ian_goodfellow[S] 5 指標 1 年 前
I'm helping Yoshua write a textbook, and working on getting Pylearn2 into a cleaner and better documented state before I graduate.
[–]exellentpossum 1 指標 12 月 前
Any particular developments in deep learning that you're excited about?
[–]ian_goodfellow[S] 5 指標 12 月 前
I'm very excited about the extremely large scale neural networks built by Jeff Dean's team at Google. The idea of neural networks is that while an individual neuron can't do anything interesting, a large population of neurons can. For most of the 80s and 90s, researchers tried to use neural networks that had fewer artificial neurons than a leech. In retrospect, it's not very surprising that these networks didn't work very well, when they had such a small population of neurons. With the modern, large-scale neural networks, we have nearly as many neurons as a small vertebrate animal like a frog, and it's starting to become fairly easy to solve complicated tasks like reading house numbers out of unconstrained photos: http://www.technologyreview.com/view/523326/how-google-cracked-house-number-identification-in-street-view/ I'm joining Jeff Dean's team when I graduate because it's the best place to do research on very large neural networks like this.
[–]Letter_Guardian 3 指標 1 年 前
Hi Prof. Bengio,
Thank you for doing this AMA. Questions:
How much do you think we can actually accomplish in the big data challenge?
Do you think data alone is sufficient to solve practical problems, as opposed to using some kind of expert knowledge?
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
At the end of the day there is only data. Expert knowledge also comes from past experience: either communicated by other humans (recently, or in past generations, through cultural evolution) or from genetic evolution (which also relies on experience to engrave knowledge into genes). What this may potentially say is that we may need different kinds of optimization methods, and not just those based on local descent (like most learning algorithms).
All that being said, if I try to solve a practical problem in the short term, it can be very useful to use prior knowledge. There are many ways that this has been done in deep learning, either through preprocessing, the architecture and/or the training objective (e.g., especially through regularizers and pre-training strategies). However, I much prefer when the data can override the prior that is injected (and this is also theoretically more sound, as one considers that more and more data can be exploited).
[–]FuzzySets 3 指標 1 年 前
I'm currently finishing up my undergrad in philosophy of science and logic and I am trying to make the switch to computer science for masters work with the intention of pursuing machine learning at the PhD level. Besides filling in the obvious knowledge gaps in mathematics and basic programming skills, what are some of the things a person in my position could do to make themselves a more attractive candidate for your field of work? Thanks so much for visiting us at r/MachineLearning!
[–]yoshua_bengioProf. Bengio 10 指標 12 月 前
Read deep learning papers and tutorials, starting from the introductory material and moving your way up. Take notes on your reading, trying to summarize what you learned.
Implement some of these algorithms yourself, from scratch, to make sure you understand the math for real; implement variants of them, not just a copy of some pseudo-code you found in a paper (a minimal example of this kind of from-scratch exercise is sketched right after this roadmap).
Play with these implementations on real data, maybe competing in Kaggle competitions. The point is that a lot is learned by actually getting your hands into the data and playing with variants of these algorithms (this is true in general for machine learning).
Write about your experiences, results, and thoughts in a blog. Initiate contact with researchers in the field and ask them if they would like you to work remotely on some of the projects and ideas they have. Try to do an internship.
Apply to graduate school in a lab that actually does these things.
Is the roadmap clear enough?
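As an illustration of the "implement it from scratch" item above, here is the sort of minimal exercise that is meant: a tiny denoising auto-encoder with tied weights, written directly in numpy on toy data, with the gradients derived by hand. This is a learning sketch under simplifying assumptions (toy data, fixed learning rate), not production code:

    import numpy as np

    rng = np.random.RandomState(0)

    # Toy data: 20 binary features made of 10 random bits, each duplicated, so the
    # features are correlated and denoising from a partial input is actually possible.
    X = np.repeat((rng.rand(200, 10) > 0.5).astype(float), 2, axis=1)

    n_vis, n_hid = 20, 10
    W = rng.randn(n_vis, n_hid) * 0.1       # tied weights: W encodes, W.T decodes
    b_h, b_v = np.zeros(n_hid), np.zeros(n_vis)

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    lr, noise = 0.1, 0.3
    for epoch in range(50):
        total = 0.0
        for x in X:
            x_tilde = x * (rng.rand(n_vis) > noise)     # randomly zero out some inputs
            h = sigmoid(x_tilde @ W + b_h)              # encode
            x_rec = sigmoid(h @ W.T + b_v)              # decode (reconstruction)
            # Cross-entropy reconstruction loss against the *clean* input,
            # with the gradients written out by hand (backprop for this tiny model).
            total += -np.sum(x * np.log(x_rec + 1e-9) + (1 - x) * np.log(1 - x_rec + 1e-9))
            d_rec = x_rec - x                           # dL/d(pre-activation of x_rec)
            d_h = (d_rec @ W) * h * (1 - h)             # dL/d(pre-activation of h)
            W -= lr * (np.outer(x_tilde, d_h) + np.outer(d_rec, h))
            b_h -= lr * d_h
            b_v -= lr * d_rec
        if epoch % 10 == 0:
            print(epoch, total / len(X))                # average reconstruction error goes down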
[–]karmicthreat 3 指標 1 年 前
So I've had a desire to get deep into Deep Learning and general machine learning for a while. I'm currently taking the computational neurology course coursera offers. I'll follow that up with the ML and NN courses.
Where do you recommend someone go from there? I've not seen much that is at the grad level out there.
[–]last_useful_man 1 指標 1 年 前
https://www.coursera.org/courses?orderby=upcoming&search=computational%20neurology
(comes up empty) - care to clarify? Clinical neurology perhaps?
[–]karmicthreat 2 指標 1 年 前
Sorry, I meant computational neuroscience. Which makes sense, since neurology would be more the study of disorders of the nerves. Which while interesting I'm not really after that particular aspect of the CNS.
[–]last_useful_man 1 指標 1 年 前
Holy moly, that exists! https://www.coursera.org/course/compneuro
Awesome, thank you!
[–]lars_ 3 指標 1 年 前
Hi! The guys behind the Blue Brain project intend to build a working brain by reverse engineering the human brain. I heard Hinton criticize this approach in a talk. I got the impression that he believed the kind of work that is done within ML would be more likely to lead to a general strong AI.
Let's imagine we are some time in the future, and we have created strong artificial intelligence - that passes the Turing test, and generally passes as alive and conscious. If we look at the code for this AI, do you think it would mostly be a result of reverse engineering the human brain, or would it be mostly made of parts that we humans have invented on our own?
[–]yoshua_bengioProf. Bengio 6 指標 12 月 前
I don't think that Hinton was critical of the idea of reverse-engineering the brain, i.e., to consider what we can learn from the brain in order to build intelligent machines. I suspect he was critical of the approach in which one tries to get all the details right without an overarching computational theory that would explain why the computation makes sense (especially from a machine learning perspective). I remember him making that analogy: imagine copying all the details of a car (but with an imperfect copy), putting them together, and then turning on the key and hoping for the car to move forward. It's just not going to work. You need to make sense of these details.
[–]redkk 3 指標 1 年 前
Hi Sir, I am a self-learner trying to train a sparse autoencoder with linear/ReLU units. What would be a suitable sparsity cost that is differentiable? I saw something that uses KL divergence but could not understand it. Is the sparsity-inducing formula a holy grail or a secret? Thanks, KK.
[–]yoshua_bengioProf. Bengio 5 指標 12 月 前
Not a holy grail or secret. With a denoising auto-encoder setup and rectifiers, you easily get sparsity, especially with an L1 penalty. With sigmoids you are better off with the KL divergence penalty. It just says that the output of the units should be close to some small target (like 0.05) on average, but instead of penalizing the squared difference it uses the KL divergence, which is more appropriate for comparing probabilities. My colleague Roland Memisevic is more involved than I am in experimenting with such things and could probably tell you more.
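For reference, both penalties mentioned here are short formulas. With a matrix of hidden activations (one row per example), the L1 penalty is just the mean absolute activation, and the KL penalty compares each unit's average activation to a small target rho such as 0.05. A small numpy sketch (the penalty coefficient and the target are assumptions you would tune):

    import numpy as np

    def l1_sparsity(h):
        # h: (n_examples, n_hidden) activations, e.g. outputs of rectifier units.
        return np.mean(np.abs(h))

    def kl_sparsity(h, rho=0.05, eps=1e-8):
        # h: (n_examples, n_hidden) sigmoid activations in (0, 1).
        rho_hat = np.clip(h.mean(axis=0), eps, 1 - eps)   # average activation of each unit
        kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
        return np.sum(kl)

    # Both are differentiable (almost everywhere for the L1 case), so either one can
    # simply be added, times a small coefficient, to the reconstruction cost.
    h = np.random.rand(32, 100)
    print(l1_sparsity(h), kl_sparsity(h, rho=0.05))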
[–]evc123 3 指標 1 年 前
Hi Prof Bengio,
Is it possible to get into Lisa-Lab without any Machine learning/Deep Learning publications? The university I'm attending does a tiny bit of research in computer vision, bioinformatics, and 1980s-era neural networks; but none of it as contemporary or as in-depth as the research at Lisa-Lab and the other labs listed on Deeplearning.net
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
We have taken such candidates recently, especially if they are strong in math and computer science. Note that we have pretty much filled the open positions for Fall 2014, though.
[–]ddebarr 3 指標 12 月 前
As EJBorey says, "I've heard that it's difficult for non-experts to get these techniques to work." What is the most promising work being done to automate the configuration of deep learning networks? Thanks!
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
Please see this reply: http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq884k
[–]SnowLong 6 指標 1 年 前
Are there attempts to apply neural nets to the task of machine translation?
When do you think NN-based approaches will replace statistical methods in commercially deployed MT systems? I mean, in speech recognition (all major industry players) and vision (Google, Baidu) tasks, NNs are already deployed...
[–]yoshua_bengioProf. Bengio 5 指標 12 月 前
I just started a page that lists some of the papers on neural nets for machine translation: https://docs.google.com/document/d/1lqo5N1LzVWNPy1sYuujNa5vVNmyP5Zjv6VtEVgcFr6k
Briefly, since neural nets already beat n-grams on language modeling, you can first use them to replace the language-modeling part of MT. Then you can use them to replace the translation table (after all it's just another table of conditional probabilities). Other fun stuff is going on. The most exciting and ambitious approaches would completely scrap the current MT pipeline and learn to do end-to-end MT purely with a deep model. The interesting aspect of this is that the output is structured (it is a joint distribution over sequences of words), not a simple point-wise prediction (because there are many translations that are appropriate for a given source sentence).
[–]SnowLong 1 指標 12 月 前
Thank you! Insights help and I'm starting to read papers so thanx for the list too (:
[–]EJBorey 1 指標 1 年 前
Sure. Here's a New York Times article that talks about real-time machine translation from English into Mandarin: http://www.nytimes.com/2012/11/24/science/scientists-see-advances-in-deep-learning-a-part-of-artificial-intelligence.html
[–]SnowLong 3 指標 1 年 前
I saw that video from MS, a very impressive one. But I do not believe the MT part was done using NNs. Speech recognition - yes. Speech synthesis - most likely. MT - nope.
[–]Two-Tone- 5 指標 1 年 前
What are your thoughts on Google acquiring all of these different AI related companies the last year or so?
[–]EJBorey 2 指標 1 年 前
Any advice on hiring your students? What is compelling to the modern machine learning PhD?
[–]kablunk 2 指標 1 年 前*
Sorry for being so mundane: what as-yet-unexplored fields do you see machine learning being applied to in the future?
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
I would rather ask about fields where machine learning is NOT going to be applied ;-)
[–]sssub 2 指標 1 年 前
Dear Prof. Bengio,
In neuroinformatics, several researchers work in the field of "reservoir computing" (a random sparse RNN with a linear readout, which is the only part that is trained). Comparing this architecture to deep networks, I see a lot of similarities between the two approaches. There seems to be a strong link between learning abstract features in deep architectures and plasticity mechanisms in spiking reservoirs.
I would very much like to hear your opinion on this.
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
Biological motivation is indeed very interesting, but learning the recurrent weights is crucial to get computational competence, as I wrote there:
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfpboj8
[–]rpascanu 1 指標 12 月 前
Correct me if I'm wrong, but the reservoir computing paradigm assumes that the reservoir (the recurrent and input-to-hidden weight matrices) is randomly sampled (from a carefully crafted distribution) and not learned. By plasticity mechanisms, do you refer here to RC methods that use some local learning mechanism for the weights?
If not, I believe one can answer your question along this line: both RC approaches and DL approaches are trying to extract useful features from data. However, RC does not learn this feature extractor, while DL does. Of course, as you pointed out, there are a lot of similarities. There are a lot of things DL could learn from RC research, and the other way around.
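For readers unfamiliar with the contrast being drawn here, a minimal echo-state-network sketch (one common flavour of reservoir computing) may help: the recurrent weights are sampled once and left fixed, and only a linear readout is fit. All sizes and the toy task below are illustrative assumptions:

    import numpy as np

    rng = np.random.RandomState(0)
    n_in, n_res, T = 1, 200, 1000

    # Fixed random reservoir, rescaled so its spectral radius is below 1
    # (a common recipe for the "echo state" property); these weights are never trained.
    W_res = rng.randn(n_res, n_res)
    W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))
    W_in = rng.randn(n_res, n_in) * 0.5

    def run_reservoir(u):
        # u: (T, n_in) input sequence -> (T, n_res) reservoir states.
        states, x = np.zeros((len(u), n_res)), np.zeros(n_res)
        for t in range(len(u)):
            x = np.tanh(W_res @ x + W_in @ u[t])
            states[t] = x
        return states

    # Toy task: reproduce the input signal from 5 steps ago.
    u = rng.rand(T, 1)
    y = np.roll(u[:, 0], 5)
    S = run_reservoir(u)

    # The only learned part: a linear readout, fit by ridge regression.
    W_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(n_res), S.T @ y)
    mse = np.mean((S @ W_out - y)[100:] ** 2)
    print(mse)   # should be well below the input variance (~0.083) if the reservoir keeps enough memory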
[–]sssub 1 指標 12 月 前
Yes, I am referring to local, biologically-inspired learning mechanisms. An example is spike-timing-dependent plasticity (STDP), which is then investigated in reservoir systems. Such architectures look a lot like autoencoders.
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前*
"Looking a lot like" is interesting, but we need a theory of how this enables doing something useful, like capturing the distribution of the data, or approximately optimizing a meaningful criterion.
[–]US932H923 2 指標 12 月 前
Who are some of the people you have a lot of respect for?
What was the last fiction book that you've read?
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
I have a lot of respect for a lot of people! One clue is whom I cite! Another is whom I invite to the workshops and conferences I organize.
[–]m4linka 2 指標 12 月 前
Dear Prof. Bengio.
In my experience with using different neural network models, it seems that either a good initialization (for example via pre-training, or some sort of guided learning), or the structure (think of the convolutional net), or standard regularization like an L2 norm, is crucial for learning. In my opinion all of them are special forms of regularization. Therefore, it looks like "without prior assumptions, there is no learning". In the era of "big data" we can slowly decrease the influence of the regularization part, and therefore develop more "data-driven" approaches.
Nonetheless, some form of regularization is still needed. To me it seems there is a complexity gap between training networks from scratch (keeping the regularization as small as possible) and using regularized networks (structure, L2 norm, pre-training, smart initialization, ...). Something like the gap between P and NP in complexity theory.
Are you aware of any literature that tackle this problem from the formal or experimental perspective?
[–]yoshua_bengioProf. Bengio 4 指標 12 月 前
In a theoretical sense, you would imagine that as the amount of data goes to infinity, priors become useless. Not so, I believe. Not only because of the potentially exponential gains (in terms of number of examples saved) of some priors, but also because some priors have computational implications. For example, the depth prior can save you both statistically and computationally, when it allows you to represent a highly variable function with a reasonable number of parameters. Another example is the time for training. If (effective) local minima are an issue, then even with more training data you could get stuck in poor solutions that a good initialization (like pre-training) could avoid. Unless you take both the amount of data and the computational resources to infinity (and not just "large"), I think some forms of broad priors are really important.
[–]m4linka 1 指標 12 月 前
That is interesting. Could you point out some literature on this topic?
[–]davidscottkrueger 1 指標 12 月 前
According to yesterday's talk, the private-dataset network in this paper was trained without regularization, suggesting that with enough data it may not be needed (although it likely depends on the dataset/task): http://arxiv.org/pdf/1312.6082v2.pdf
[–]US932H923 2 指標 12 月 前
When you're learning something new, do you spend time trying to figure out how the learning process is happening in your own brain?
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
Typically not. I get too excited when something clicks. My brain races and my urge is to write my understanding down or talk about it.
But at other times, I do marvel at this phenomenon and think about it.
[–]DavidJayHarris 2 指標 12 月 前
Hi Professor Bengio, thanks so much for answering our questions. I was wondering what you thought of stochastic feedforward methods like the one Tang and Salakhutdinov presented at NIPS last year.
It seems to me like a great way to get some of the benefits of stochastic methods (especially the ability to predict multiple modes) while retaining the efficiency of feedforward methods that can be trained by backprop. It seems like there are some interesting parallels between this approach and the stochastic networks your lab has been working on, and I'd love to hear your thoughts on the comparison.
Thanks again!
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
I very much like their paper. We have been working on very similar stuff!
[–]nxvd 2 指標 12 月 前
Hello Dr. Bengio,
Thank you for your time. There are two questions I would like to ask you, if you don't mind:
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
I consider that one of my greatest successes is to have contributed to a collaborative, open, and collegial atmosphere in the lab. The common good is not an idle concept here. It also helps to make students a lot more motivated; they enjoy their time here and contribute to group efforts.
[–]dhammack 1 指標 1 年 前
If I were summarizing the results from deep models, I'd say that deep models are excelling at problems where humans held the previous state of the art (vision/audio/language).
Do you know of any successes on problems of the opposite nature, i.e., problems where statistical methods are already better than humans? One example I can think of is the Merck Kaggle challenge won by George Dahl, but I'd love to hear of some more.
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
Yes, I know of some such cases, in the realm of recommendation systems or fraud detection, when the number of input variables is large and cannot be easily visualized or digested by a human. Although I don't know of head-to-head comparisons with human performance, the sheer speed advantage makes it impractical to even consider humans for such jobs (except maybe to consider the few cases flagged by a machine).
[–]zach_will 4 指標 1 年 前
Hi Professor!
I always find myself resorting to ensembles and random forests in my projects (I think I can just internalize decision trees much better than deep learning). Could you offer the flip side for why I should be excited about neural networks?
(I mostly work with "medium-sized" data, and it usually fits on a single machine.)
Thanks!
[–]yoshua_bengioProf. Bengio 4 指標 12 月 前
I wrote some papers explaining why decision trees are doomed to generalize poorly:
http://www.iro.umontreal.ca/~lisa/pointeurs/bengio+al-decisiontrees-2010.pdf
The key point is that decision trees (and many other machine learning algorithms) partition the input space and then allocate separate parameters to each region. Thus there is no generalization to new regions or across regions. There is no way you can learn a function which needs to vary across a number of distinct regions that is greater than the number of training examples. Neural nets do not suffer from that and can generalize "non-locally" because each parameter is re-used over many regions (typically HALF of the input space, in a regular neural net).
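A tiny scikit-learn illustration of the "separate parameters per region" point (a toy example, not taken from the paper above): a tree fit on inputs in [0, 1] predicts a constant once you move outside the regions it has seen, while a model whose parameters are shared across the whole input space (here a plain linear model stands in for that side) keeps following the learned trend:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.RandomState(0)
    X = rng.rand(200, 1)                      # training inputs, all inside [0, 1]
    y = 3.0 * X[:, 0] + 0.1 * rng.randn(200)  # a simple linear trend plus noise

    tree = DecisionTreeRegressor(max_depth=8).fit(X, y)
    lin = LinearRegression().fit(X, y)        # stands in for any model with shared parameters

    X_new = np.array([[0.5], [2.0], [5.0]])   # the last two are far outside the training range
    print(tree.predict(X_new))   # roughly [1.5, 3.0, 3.0]: constant outside the regions it has seen
    print(lin.predict(X_new))    # roughly [1.5, 6.0, 15.0]: keeps following the learned trend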
[–]kablunk 3 指標 1 年 前
What are some things that self-taught machine learning scientists lack that those trained in a formal environment (university or similar) have?
(I'm asking as a member of the first group)
[–]SuperFX 4 指標 1 年 前
There seems to be a recent trend where a lot of deep learning researchers have moved to industry, ostensibly to gain access to very large data sets. Do you think deep learning research within academia can continue to flourish without such access? Or is the field invariably moving toward HPC and massive data sets as prerequisites?
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
I think that there are plenty of huge datasets available for free out there. Think about all of Wikipedia, all of YouTube, etc. Not to mention: all of the internet.
Computing power is another question, but in some countries like Canada, the government is encouraging (or forcing) scientists to share computational resources. The result is that I have access to more computational power than most of my American colleagues. Plus, the cost of computing power continues to go down.
[–]javiermares 3 指標 1 年 前*
Professor Bengio,
What do you think of Ray Kurzweil's PRTM? Do you think any of its characteristics could be implemented on current deep learning techniques to improve their capabilities?
Thank you.
[–]yohamoha 3 指標 1 年 前
Hello, professor. I have a question that I always ask experts in their fields: In your field of study, what is the best book/paper you know of? Why? (here "best" can have any meaning, as long as it's specified)
Thanks.
[–]yoshua_bengioProf. Bengio 5 指標 12 月 前
There are too many good papers.
My students have put together a list of papers to read for the new students of the lab:
https://docs.google.com/document/d/1IXF3h0RU5zz4ukmTrVKVotPQypChscNGf5k6E25HGvA
[–]hltt 1 指標 1 年 前
Can you think of any other interesting deep learning approaches to NLP besides the recursive neural networks from Richard Socher?
[–]rpascanu 1 指標 12 月 前
RNNs as in recurrent neural networks (e.g. Tomas Mikolov's work) are also very interesting IMHO.
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
Indeed.
[–][deleted] 1 指標 1 年 前
Hi professor Yoshua Bengio.
Do you think that machine learning as we understand it today will be the basis of future AI?
Which is a bigger obstacle to making AI stronger, hardware limitations or algorithmic/software problems? What is the biggest obstacle to making AI better in general?
What do you think of Ray Kurzweil's prediction that an AI will pass the Turing test by 2029? He has placed a bet on this prediction.
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
I won't bet on the year that AI will pass the Turing test, but I will certainly bet that machine learning will be a central technology to future AI.
The biggest obstacle to improving AI is to improve machine learning. To improve ML enough to get there, there are still many obstacles. Only some of them have to do with computing power. Others are more conceptual. For example I am convinced that there are still fundamental obstacles to learning the joint distribution of many variables for AI-like tasks. I also think that we have not even scratched the surface of the optimization challenges involved in training very large deep nets. Then there is reinforcement learning, which will be clearly necessary and on which advances are clearly needed (see the recent exciting work by the DeepMind people, on learning to play 80's Atari games, and presented at the Deep Learning Workshop at NIPS, which I organized).
[–][deleted] 1 指標 12 月 前
Thank you for your response.
[–]edersantana 1 指標 1 年 前
Which suggestions would you give to a young professor building a new research lab on machine learning, neural networks and such? What do you think are the most important aspects of the lab environment, and of hardware and software resources? What about international cooperation? Also, how can one be competitive worldwide?
[–]yoshua_bengioProf. Bengio 6 指標 12 月 前
Focus on your research.
Engage in collaboration and discussion with scientists from whom you can learn.
Read. Read. Read.
Focus on your research.
Nourish your graduate students intellectually and at a personal relational level, like a father with his children.
Go to the best conferences of your field. Talk. Talk. Talk.
Keep thinking about the long term and steering back in the directions that you believe are promising, even though it's tempting to follow the trend and do incremental contributions. Believe in yourself.
Focus on your research.
[–]sixwings 1 指標 1 年 前
Professor Bengio,
Thank you for taking our questions. How do you respond to this criticism of Deep Learning from Jeff Hawkins:
Source: Deep Learning
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
See the replies below. There is plenty of deep learning work involving temporal structure. More will come, for sure.
[–]richardabrich 1 指標 1 年 前
Recurrent neural networks model temporal relationships implicitly. They're often used for speech recognition. There has been some work on deep recurrent neural networks. [1,2]
[1] http://www.cs.toronto.edu/~hinton/absps/RNN13.pdf
[2] http://papers.nips.cc/paper/5166-training-and-analysing-deep-recurrent-neural-networks.pdf
[–]rpascanu 1 指標 12 月 前
http://arxiv.org/abs/1312.6026.
RNNs are also used in NLP. Some other interesting work that goes toward recurrent models (for scene parsing, in this case) is this: http://arxiv.org/abs/1306.2795
[–]davidscottkrueger 1 指標 12 月 前
Of course, this cannot be taken as a valid criticism of the promise or potential of Deep Learning, because DL can account for the concept of time.
However, I think the point he is making about systems that interact with the world in real time vs. systems that don't is huge, and currently, DL's big successes are not in real-time applications.
I think a greater emphasis on real-time methods across the board would be a good thing. And I think that Reinforcement Learning will ultimately be more important than supervised/unsupervised learning.
[–]hf98hf43j2klhf9 1 指標 1 年 前
[META] In the comments at the verification page it looks like Yann LeCun is open to the AMA idea as well! Should we try to request him as well?
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
It would be fun.
[–]32er234 1 指標 12 月 前
You sending him an email will probably be more effective than all of us trying to bombard his social media pages ;-)
[–]IdoNotKnowShit 1 指標 1 年 前
Bonjour professeur Bengio! Thank you so much for this AMA! Here are a few questions of mine (not chosen i.i.d.):
Where does deep learning show promise? And in what application would it be an absolutely horrible choice?
Why do stacked RBMs work? Is this something that can be explained in a thoroughly formal manner, or is there still some magic that needs to be unraveled?
What would you say is the relationship between ensemble learning and deeply layered learning?
Can you describe some of the work your lab/grad students is/are doing and why you support it?
What are some of the best things about living in Montreal?
How do you like to approach a research question? What kind of working environment do you prefer?
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前*
There is no such thing as magic, except in our emotional interpretation. I believe that I have a fairly well-rounded interpretation of why stacks of RBMs or regularized auto-encoders work so well. I have written about this; see in particular the 2012/2013 review paper with Courville & Vincent:
http://arxiv.org/abs/1206.5538
(also published in PAMI 2013)
I don't know of relationships between ensemble learning and deep layered learning besides the beautiful interpretation of dropout. For example, see http://arxiv.org/abs/1312.6197
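The ensemble interpretation of dropout can be sketched in a few lines: during training each example goes through a random sub-network (a random binary mask on the hidden units), and at test time running the full network with activations scaled by the keep probability approximates averaging the predictions of the exponentially many sub-networks. A toy forward-pass sketch (not tied to any particular paper's code; with a linear output layer the averaging is exact in expectation):

    import numpy as np

    rng = np.random.RandomState(0)
    W1, W2 = rng.randn(20, 50) * 0.1, rng.randn(50, 1) * 0.1
    keep_prob = 0.5

    def forward(x, train=True):
        h = np.maximum(0.0, x @ W1)                     # hidden layer (rectifier)
        if train:
            h = h * (rng.rand(*h.shape) < keep_prob)    # one random member of the "ensemble"
        else:
            h = h * keep_prob                           # scale activations: average the ensemble
        return h @ W2

    x = rng.randn(20)
    samples = np.array([forward(x, train=True) for _ in range(2000)])
    print(samples.mean(), forward(x, train=False)[0])   # the two numbers should be close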
My students have written a few words about studying in Montreal, for new graduate candidates:
http://www.iro.umontreal.ca/~bengioy/yoshua_en/index_files/open_positions.html
Montreal is a large city with 4 universities, a very rich cultural tradition, close to nature, and where the quality of life (including security) is among the best (the 4th best in North America, according to Mercer). The cost of living is substantially lower than in other similar-sized North American cities.
[–]moseconseco2 1 指標 1 年 前
Can you talk about the connection, if there is one, between big, structured knowledge projects like Google's Knowledge Graph (built largely on the entity graph Freebase) and deep learning?
Is it significant that the data of the knowledge graph has this recursive network structure that looks a lot like the layers of abstraction in a deep learning setup?
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
There is plenty of room in the Knowledge Graph project for machine learning, and so for deep learning. In particular, you want ML to help you guess the missing attributes of objects in the graph and even guess the missing relationships (so that you can even automatically insert new objects in the graph, based on some of their attributes).
[–]strayadvice 1 指標 12 月 前
This question is regarding deep learning. From what I understand, the success of deep neural networks on a training task relies on choosing the right meta-parameters, like network depth, hidden layer sizes, the sparsity constraint, etc. And there are papers on searching for these parameters using random search. Perhaps some of this relies on good engineering as well. Is there a resource where one could find "suggested" meta-parameters, maybe for specific classes of tasks? It would be great to start with these tested parameters, and then search/tweak for better parameters for a specific task.
What is the state of research on dealing with time series data with deep neural nets? Deep RNN's perhaps?
[–]yoshua_bengioProf. Bengio 3 指標 12 月 前
Regarding the first question you asked, please refer to what I wrote earlier about hyper-parameter optimization (including random search):
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfq884k
James Bergstra continues to be involved in this line of work.
[–]rpascanu 2 指標 12 月 前
Here is a list of more recent work. The idea of deep RNNs (or hierarchical ones) is older; both Jurgen Schmidhuber and Yoshua have had papers about it since the '90s.
http://arxiv.org/abs/1306.2795
http://arxiv.org/abs/1312.6026
http://arxiv.org/abs/1308.0850
http://papers.nips.cc/paper/5166-training-and-analysing-deep-recurrent-neural-networks.pdf
[–]james_bergstra 2 指標 12 月 前
I think having a database of known-configurations that make for good starting points for search is a great way to go.
That's pretty much my vision for the "Hyperopt" sub-projects on github: http://hyperopt.github.io/
The hyperopt sub-projects specialized for nnets, convnets, and sklearn currently define priors over what hyperparameters make sense. Those priors take the form of simple factorized distributions (e.g. number of hidden layers should be 1-3, hidden units per layer should be e.g. 50-5000). I think there's room for richer priors, different parameterizations of the hyperparameters themselves, and better search algorithms for optimizing performance over hyperparameter space. Lots of interesting research possibilities. Send me email if you're interested in working on this sort of thing.
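For readers looking for a starting point, here is a minimal sketch of how the hyperopt API is typically used; the objective function below is a made-up stand-in, and in practice it would train and validate a network with the given hyper-parameters:

    import numpy as np
    from hyperopt import STATUS_OK, fmin, hp, tpe

    # A prior over hyper-parameters: log-uniform learning rate, 1-3 layers,
    # 50-1000 hidden units per layer (ranges are illustrative, not recommendations).
    space = {
        "learning_rate": hp.loguniform("learning_rate", np.log(1e-4), np.log(1e-1)),
        "n_layers": hp.quniform("n_layers", 1, 3, 1),
        "n_hidden": hp.quniform("n_hidden", 50, 1000, 50),
    }

    def objective(params):
        # Stand-in for: build a net with these hyper-parameters, train it, and
        # return the validation error. Here we just score a made-up function.
        fake_valid_error = (np.log10(params["learning_rate"]) + 2.5) ** 2 / params["n_layers"]
        return {"loss": fake_valid_error, "status": STATUS_OK}

    best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
    print(best)   # the settings with the lowest (fake) validation error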
[–][刪除] 12 月 前
[deleted]
[–]yoshua_bengioProf. Bengio 4 指標 12 月 前
Initially, 90% intuition, 10% math.
Then more math comes. Then you try it out and you find problems and you update your intuition and your math... etc.
And intuition comes from letting a problem sit in your head for a while, reading about it, asking yourself the question, working with it, talking with others about it, etc.
[–]32er234 1 指標 12 月 前
Is fluency in French a pre-requisite to becoming your student? Does it matter at all?
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
Not a pre-requisite at all. Most new students know very little or no French when I recruit them.
[–]32er234 1 指標 12 月 前
Given three candidates, none of whom have much experience in ML, whom would you rather choose as a potential student (other dimensions being equal):
Someone experienced in applied statistics (say, psychology research, or epidemiology), knows R
Someone who is very good at software development and knows some numpy/scipy, Matlab
Pure math undergrad who has little exposure to either programming or "real world" data
[–]yoshua_bengioProf. Bengio 2 指標 12 月 前
I can afford many students. I would not evaluate based on the above features alone, but also based on an interview, in which all aspects come together. Strength in math is an excellent predictor of success in machine learning research, so math undergrads with good programming skills are very high on my list of preferences. Strong software development is also very important for many of the projects we have, which involve big data and big models, where computational efficiency and top-notch collective programming are really important.
[–]andrewff 1 指標 12 月 前
I know I'm a little late to the party, but I was just wondering if you thought there was any room for an evolving-topologies algorithm such as NEAT within deep learning? In some ways, techniques like dropout and DropConnect approach an evolving-topology type of methodology, but overall the idea of an evolving topology is not entirely captured by such techniques.
Thanks for doing this AMA!
[–]rishok 1 指標 12 月 前
Hello Prof. Bengio. I am a student from Denmark.
I am trying to add your Maxout Networks solution to the sparse autoencoder to see the potential benefits... do you have any preliminary comments?
Can we be allowed to see more updates on your DL book? hehe
[–]meiyordrummer123 1 指標 7 月 前
Hello Professor Bengio. I tried to run the Matlab toolbox that you have for DBNs, and at the same time I ran the PLearn app, but I want to know how I can run a similar process in both, because some options in PLearn are quite different from the Matlab schemes, and it would be useful for prototyping a faster application.
Thank you
JMM
[–]sasaram 1 指標 1 年 前
Hi Prof. Bengio, very happy to see you here.
[–]yoshua_bengioProf. Bengio 3 指標 1 年 前
Recurrent nets are deep in the sense that the computation they perform (when you consider unfolding them in time) corresponds to a very deep network (albeit with shared weights across layers).
My definition of deep is that you have multiple levels of representation, with the i-th level obtained as a learned function of the representations at the lower levels. I also insist that the number of such levels be data-dependent, and I expect that higher-level representations capture more abstract features of the data which can only be obtained by the composition of the features at the lower levels, i.e., they are highly non-linear functions of the raw input.
[–]melipone 1 指標 1 年 前
Hi! No experience with deep learning here. The introduction says that deep learning advances in machine learning can be used to solve artificial intelligence problems. Does that mean solving the consciousness/self-awareness problem, or is it meant in a narrower sense?
[–]yoshua_bengioProf. Bengio 5 指標 1 年 前
Deep learning is not about consciousness or self-awareness but about something that I consider much more important from a practical point of view as well as much more challenging: allowing computers to understand the world around us. I believe that we will have fairly intelligent machines that understand the world around us, but have no "consciousness", or rather no "self", in any way close to what humans have. Not because it would be difficult to introduce that, but because it would not be necessary in order to produce a lot of useful technology. Not to speak of the fact that once you introduce a self in intelligent machines, you have to worry about Asimov's rules, etc.
[–]anne-nonymous 1 指標 1 年 前
There are some robots that are self-aware :). Seriously.
http://www.scientificamerican.com/article/automaton-robots-become-self-aware/
[–]dnoup -2 指標 1 年 前
What exactly is deep learning and how does it differ from conventional ML?
[–]yoshua_bengioProf. Bengio 3 指標 1 年 前
I have my definition above:
http://www.reddit.com/r/MachineLearning/comments/1ysry1/ama_yoshua_bengio/cfqay3e
Keep in mind that deep learning is part of machine learning.
[–]augustus2010 -3 指標 1 年 前
Could you explain the rationale behind sparsity and deep learning?
[–]yoshua_bengioProf. Bengio 3 指標 1 年 前
I have already explained why deep learning is interesting. It is a broad prior and it brings both statistical and computational advantages, where it is an appropriate prior.
Sparsity is another prior: it assumes that for any given input, only a small subset of all the possible concepts known to the learner are relevant. Again, it is useful where it is applicable.
I believe that both priors are useful for many real-world problems where we want AI.
[–]melipone -9 指標 1 年 前
Did we get just 5 responses to ~100 questions?
[–]Noncomment 4 指標 1 年 前
I'm actually very impressed with this AMA. He answered almost all of the questions and put a lot of effort into the responses. Your comment was premature.
[–]vinnl -16 指標 1 年 前
Do you know all the terms mentioned in the questions here?