
Print ISSN: 0974-7710 | Online ISSN: 0974-7729


International Journal of Web Applications

Bandit Based Pure Exploration of Greedy Learning
Tian Lin, Jian Li, Wei Chen
Tsinghua University, Beijing, China; Microsoft Research, Beijing, China
Abstract: Many combinatorial optimization problems admit greedy algorithms. We study online learning for stochastic greedy algorithms. First, we propose the quasi-greedy regret as a learning metric, which compares online performance against that of the offline greedy algorithm. We then develop a two-stage online greedy algorithm with semi-bandit feedbacks, which uses multi-armed-bandit pure exploration at each level of the greedy process, and show that it performs well under this regret metric. The framework permits a general class of combinatorial structures and reward functions that admit greedy solutions. Finally, we evaluate the proposed algorithms empirically on instances drawn from other problems.
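To make the idea concrete, here is a minimal sketch (not the authors' exact algorithm) of bandit pure exploration inside a greedy loop: at each greedy level, every remaining candidate arm is sampled under a fixed exploration budget, and the algorithm commits to the empirically best arm before moving to the next level. The function names, the `pull` reward oracle, and the budget parameter are all illustrative assumptions.

```python
import random


def greedy_with_pure_exploration(candidates, pull, levels, budget_per_arm=100):
    """Two-stage greedy learning sketch (illustrative, not the paper's algorithm).

    At each greedy level: (1) pure exploration -- sample the stochastic
    marginal reward of every remaining candidate `budget_per_arm` times;
    (2) exploitation -- commit to the arm with the highest empirical mean.
    `pull(solution, arm)` is an assumed reward oracle returning one noisy
    sample of the marginal reward of adding `arm` to `solution`.
    """
    solution = []
    remaining = list(candidates)
    for _ in range(levels):
        empirical_means = {}
        for arm in remaining:
            samples = [pull(solution, arm) for _ in range(budget_per_arm)]
            empirical_means[arm] = sum(samples) / budget_per_arm
        best = max(remaining, key=lambda a: empirical_means[a])  # commit greedily
        solution.append(best)
        remaining.remove(best)
    return solution


# Hypothetical usage: Bernoulli arms with hidden means; with a large enough
# budget the algorithm tends to pick high-mean arms first.
true_means = {"a": 0.9, "b": 0.5, "c": 0.1}
noisy_pull = lambda sol, arm: 1.0 if random.random() < true_means[arm] else 0.0
```

With semi-bandit feedback, only the sampled arms' rewards are observed at each level, which is what the per-arm sampling loop above models.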
Keywords: Combinatorial Optimization, Greedy Algorithms, Learning Metrics
DOI: https://doi.org/10.6025/ijwa/2023/15/1/1-11
Full Text: PDF (2.41 MB) | Downloads: 51
References:

[1] Audibert, J.-Y. & Bubeck, S. (2010) Best arm identification in multi-armed bandits. In: COLT.
[2] Audibert, J.-Y., Bubeck, S. & Lugosi, G. (2011) Minimax policies for combinatorial prediction games. arXiv preprint arXiv:1105.4871.
[3] Auer, P., Cesa-Bianchi, N. & Fischer, P. (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47, 235–256.
[4] Auer, P., Cesa-Bianchi, N., Freund, Y. & Schapire, R.E. (2002) The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32, 48–77 [DOI: 10.1137/S0097539701398375].
[5] Björner, A. & Ziegler, G.M. (1992) Introduction to greedoids. Matroid Applications, 40, 284–357.
[6] Bubeck, S. & Cesa-Bianchi, N. (2012) Regret analysis of stochastic and nonstochastic multi-armed bandit problems. arXiv preprint arXiv:1204.5721.
[7] Bubeck, S., Munos, R. & Stoltz, G. (2011) Pure exploration in finitely armed and continuous-armed bandits. Theoretical Computer Science, 412, 1832–1852 [DOI: 10.1016/j.tcs.2010.12.059].
[8] Cesa-Bianchi, N. & Lugosi, G. (2012) Combinatorial bandits. Journal of Computer and System Sciences, 78, 1404–1422 [DOI: 10.1016/j.jcss.2012.01.001].
[9] Chen, S., Lin, T., King, I., Lyu, M.R. & Chen, W. (2014) Combinatorial pure exploration of multi-armed bandits. In: NIPS.
[10] Chen, W., Wang, Y. & Yuan, Y. (2013) Combinatorial multi-armed bandit: General framework and applications. In: ICML.
[11] Chvatal, V. (1979) A greedy heuristic for the set-covering problem. Mathematics of Operations Research, 4, 233–235 [DOI: 10.1287/moor.4.3.233].
[12] Gabillon, V., Kveton, B., Wen, Z., Eriksson, B. & Muthukrishnan, S. (2013) Adaptive submodular maximization in bandit setting. In: NIPS.
[13] Gai, Y., Krishnamachari, B. & Jain, R. (2010) Learning multiuser channel allocations in cognitive radio networks: A combinatorial multi-armed bandit formulation. In: DySPAN. IEEE Publications, 1–9 [DOI: 10.1109/DYSPAN.2010.5457857].
[14] Garivier, A. & Cappé, O. (2011) The KL-UCB algorithm for bounded stochastic bandits and beyond. arXiv preprint arXiv:1102.2490.
[15] Helman, P., Moret, B.M.E. & Shapiro, H.D. (1993) An exact characterization of greedy structures. SIAM Journal on Discrete Mathematics, 6, 274–283 [DOI: 10.1137/0406021].
[16] Kalyanakrishnan, S., Tewari, A., Auer, P. & Stone, P. (2012) PAC subset selection in stochastic multi-armed bandits. In: ICML.
[17] Kempe, D., Kleinberg, J. & Tardos, É. (2003) Maximizing the spread of influence through a social network. In: SIGKDD [DOI: 10.1145/956750.956769].
[18] Korte, B. & Lovász, L. (1984) Greedoids and linear objective functions. SIAM Journal on Algebraic Discrete Methods, 5, 229–238 [DOI: 10.1137/0605024].
[19] Kruskal, J.B. (1956) On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society, 7, 48–50 [DOI: 10.1090/S0002-9939-1956-0078686-7].
[20] Kveton, B., Wen, Z., Ashkan, A., Eydgahi, H. & Eriksson, B. (2014) Matroid bandits: Fast combinatorial optimization with learning. arXiv preprint arXiv:1403.5045.
[21] Kveton, B., Wen, Z., Ashkan, A. & Szepesvari, C. (2014) Tight regret bounds for stochastic combinatorial semi-bandits. arXiv preprint arXiv:1410.0949.
[22] Lai, T.L. & Robbins, H. (1985) Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6, 4–22 [DOI: 10.1016/0196-8858(85)90002-8].
[23] Lin, T., Abrahao, B., Kleinberg, R., Lui, J. & Chen, W. (2014) Combinatorial partial monitoring game with linear feedback and its applications. In: ICML.
[24] Mirzasoleiman, B., Badanidiyuru, A., Karbasi, A., Vondrak, J. & Krause, A. (2015) Lazier than lazy greedy. In: Proceedings of the AAAI Conference on Artificial Intelligence, 29 [DOI: 10.1609/aaai.v29i1.9486].
[25] Prim, R.C. (1957) Shortest connection networks and some generalizations. Bell System Technical Journal, 36, 1389–1401 [DOI: 10.1002/j.1538-7305.1957.tb01515.x].
[26] Streeter, M. & Golovin, D. (2009) An online algorithm for maximizing submodular functions. In: NIPS.


Copyright © 2010 dline.info